INTRODUCTION



1.1 VISION: A MOBILE VIDEO COMMUNICATOR

\lii.5i-ri! ii-i-5'jwltu;v liktr iflcphsmc unJ c ii>.fi oti*-ts :h? p..>%«ibi5n> «ji x: eiiu-u-w in- 

ro:i};.tinm cxchaiiJc hinv^ecit people. *l J.Cn;- u*:hm»!iV.;:cN enable wrK.-/ v-r/Jif*::!^:!. 
.♦fi.^:*. b.ii imi:i;Mri.soi:2l txaKiUitnv'arsot^s ai-t; cvinsts-> ol s ?r-;.«^*.vW:,i.' pan . h'>;c ^.i.vuai 
>:csUiie>\ e\prc\siML». c g . ihr nti\>d ot the prr^im vuu aie Ui'tttii?^ -^ nh or makm,u llw 
«ufcjc*ri ordiscii>5ton clear to VHur pjrtnt'f. I oi tf\ji:ipk- :% b;:sin'.*.ss nian nci'd^ u> 
ihc face of his buswicss pjnnvr, to he able to dt ciJc x^hcihcr bi^ fiHcr sct'in-* la be rvJi- 
sorahlc fat Ivis fMrtnrr ?n nol. I'hvrt'fori'. M>ual o>inir»irt:v'aiii>ns is icjiaitltrd l«» Ix- c^- 
scnnal lit d\Oui riv.suniicrstandjrj^'S hcJ'AVff: f^'oplf, (h'curnng, c ji., because ol ilir 

To sahsfy Uic c\ci mcrcD'^^ni: dcrtvjn;! iouarij> \ frtjnrtiunicjtifms in an uuictis 
in^jly fSiobiU" SfKtCiv. a rriorilL' vidi'o vunimunuaM '-^ iih ipjtTOct access cnx i>,Ji!C** 
3s dciiictcd to I- ic. J . i I he corninuiuCtiUu i 5 % idw* v^mivta. 

jnd J radio Isiik ii>r coiine^:Uun j :\?is:k-t 'Ajihin :hc* {n:t*mt:: i>t CitiiHvsti: nc:- 
%vi>rk lo enable miuuive u>a*re fv; 3 bfi>3d fft:t|:t- r.i s-scts, \hc \ uhiy iromrmmio^inr 
i-ir.plRvs a pt-n tor naunal usvt inuraoiiort |Kjp!.in IM] ^oflw;iir is-AiaiW on 

tht* vuk<j cfj:mmjau*;4io; is, lot < \ajnpU\ au Insfinirl brl»u^c; .sujiit«j3!iOi: MTIXt-J 
MPLij i> ;in ^cnjiiviii tin Mmm^ Puftirfs iApcjSi ith^\x^ am! iV-xjth* thj \mt ux?.
ttOii2\ Siariilardi.-2liu::'%%ork:r4i gr*xip J.SC.) UK j|l I ^(:2'^ V.V.: I NiPI it-: cnahlo 

nn: ami drnviirst: t'Ms uh the Juaip-\ isuaJ <tb;vvi-. It.- u>rr i!i;:i!:tt-i- ^apafuhn^s ami 
for M 1*1-1 i "I h3>jil'wi bcgMu. 



1.2 ENABLING FACTORS 

1.2.1 MPEG-4 Standardization

One of the enabling factors for mobile multimedia communications is the emerging
MPEG-4 audiovisual standard, which is expected to become an international standard
(IS) Hit Ml*h*.**4 vmitrn I in J ami a p.* 3 W>. MiU-U 4 version 2 i> cNpccivd lo bccwnie 

U) several viYnipijii:c> fM^am^^ttiiJisv. UHl^ kuvhjiv! Ut iMic!t»prKibih!y :ia>iiiH;is 
and hcalihs ji^mpciKnc on jfOtmicnL NvJc lluil MVWO fmsKdiiy sumkrUi/csonly 
tlic bi! stream lomiai the iooN ot ihe dcvt^k-i (htfif fo;c. r«iHc fi<r«t<>rri in irrim 

Only u shi-Mt drscTipiitin kirul pt-r^oiiiii \ ivw t>fi M1'| j:i.4 can be ci\t-n hcjc Dcaikd 
and up-i*>iiau- mfonnjurtii iin MPK^ i-J %i'f>«)n I ;tr»<! v^r^n^n can h*: Ic-uniJ at 
IMiTGI. I^Hik iidd {Sik *>7hJ prescmcd ii drKiij?tiotr nfth^ b^M.: piiuapks. i>f 
MPl:0-4, Ji'uii described □ peribiiiuncc c^^Ji^iU -M <«rihv MPD j-4 \}>,v;x1 a^dm^i 
standard. JKuhn 9^al dcj^cabcd a cnnipkvhy :!MaLvv> i^r.iii tr.trly ^viisuuj^: 
impicrnci«'4sion and IR-r^^rtp^)^.} d}^cuss,cd V{ Si ispi-ij^tti MJ'f f • 4 

Functionalities

i hc MlMiti-4 sUMd^id scl> & cciitmioft liaH> widi a jt;^lm:^w*i^■ fvl Ufi^is ii>} oiui;»nrfiii 
-^ippiicaitiDiLi. consist ing fli naiiir^il iimi xjruhcirc vidc-:» i \»suj|i .md sy.srmm 

The new wi imp-a\'?rd fuiKtionaluks ur MI*Lti-4 arc [Per Of:} 

• liybMd nanii^il and s^vnthclic 
1.2.1.2 Video coding

cepiJMus^5|,.\civ->|. lie i :a.a.<usod^Asi!uiiihclSO Iht XirhO MrKj.2.*-»nJ 

vvhith h2\i b'CC'A c\wnd<d u:ihir. ihc iian^ardr/auon cIToi: lo suppou jfb: 

Csurilv' shipped Milco obiccis. Ilu* «thilrajil%-ih.J|X'vi m<1vo ob;ecK of \|l*Hi--. Fn: 
I 2K arc i^pai up mu» «ri.ic;i>-bkK*VN »'M(is. I'*\l<* |Vi» ^\i5h«t .1 Miruiitiji biA. ct Ku; 
ZXk JJid asV CimIvJ ua NJB .icuj Wixk jvi> ba^iS Siiuiia: ut nlix l: h^scil video t.<«n*i- 
l>r<r>Mtirt schemes hui v\ nh ihr c»>din»; i»^iU oJMJ'h'i'i'J 

lion 3ihI In jiasninl b\ ^r. applicanoji-tlepcnilcni rsurihosl vvhrch is hc\onti sl^ndaiiU/a- 
fton, cf <h:ipu-t ^ ^'tsisal ohjcvi:% can hi- ir jastucenu chantjc iheir ^t/f:, *hi-N can he uf 
natural video object (VO) or they may be synthetic (computer generated) objects
which can be manipulated by the user. Every object is encoded and decoded by a different
encoder and decoder instance and may use different coding options. The video
object representation at a specific time instance is called video object plane (VOP),
which is the equivalent to "frame" for block-based video.

1.2.1.3 Applications

Typical ripplw^iitiMiN 4»r"MPf'ri-4 tiu-ludc scij^'h hi nuiUtmrdui daiabaivs. ictt:%^oik- 

di.sutbtiuiMt Willi hiyh bii aunuMil.iuun \ i;Jvo .sc^uetiLirs rVn li^«nic \ uico p:*?- 
d*.5cC;on. mubiU niuni*au'<iia. intei:t:!»:u-iii. nur.unKullj ;t-^nips* ^uncigljiuc. DVD 
4d;};:ial vcisauU- disk J. cc»r«crt'hj.»iccj sK'M^ic- and rcincval. sfu:amtn>: \uU*oon ihc hi- 
tcrnei-'iniraniv., cluiital fci :op ho\ arid nuuy more |N 21*>M. 



4| Rl«<L4ni^r.l %ttlf.i i i^ttift- MPf r.>l. \lt*l i » II II :ij 




btilbjrrMiii^ii} ^Kl#a«it||i»«** MI'KO-4 
1.2.2 Advances in mobile communications

^o: J rcil ?tnrj \:druc»»mmi_n!CJUo!j\ app!ttan»»r. She niininiuin ruHxvvCA nj:v!'.MJ*h i> 
esnr:i:iicd to he M Ivjsi ^-l Lb:t s fur !»y.v moiio:: a Ah ^kyjiU st'>sM:i ^-l i i^t 

be ai Ica'vt IS khu ibf jcnen** \:iiri' U*lcp?iont apphtaiHirs Him ever. K ^li.l.n^' 

M the mo;mni. in l ufiirc ihi ».s\i Ksiohs! S>%icm t -r MoHIc ( orrtiin:nic2:ior.i»i 

(jSM is expiCKd 10 tT.hanc< d * n • h\ :hf o* ii> suppi-^ri trjoNpuiem 
d^a iCi^ne* ^uih hi « khn ^ Thv nv\! sitjp nuv l!St**sn jH;i»h SpcctI Ciiciis? 

mw'arion SwiuJiifti^ iiv-^!;jt.u-» jiui sf t-\j>i\ii-w h* W hruh/cil in H/'iO pii»\idsi:i! a Jala 

ijic ot ibiAc kbii s ami ^u^ ba*»j'. aU . vt?vvU?p*.d fi^r liitnic u^aj;^\ Ot.»u'ivn^* a ^'iiiii 
arra OJ (*1 a*Mi t^ltvi* E*>*a *k'iav coiiipjrt\i to <.iS\l anJ nu) liu'jclorc he nt inter- 
cs! lor reahimw vutct* apphcatton*. uuhtri :r:trjiK»s A divad^ aruu-^r at DI:C I iiuv he 
thai ir cannoi hi- u^cJ U>r mob:lc applijauon.s ir^at Kn.r.t' ^vuh a ^ptcd h^bef ilian ^ 
knv h 

For mi)biU^ ^ kIco vivntimimcaiuMiS a kt", fac'oj in MPrt j-4 i> {Iw viihaiu'Cv* nKJi iv* 
l>usl«.r>s aiul tiic i\>v Mfuudus-v r^ual uhic^ Hie l-ba^r^l vod?n^ iir; 'lvv^iusci\; v a;* 
!^»vv> piui!isafi<in t>r t>bjec?s. c.t fnj ar trhji^i v.hulv u'^jiJUCii liii'li UiUilil) 
i fwrcgraund), a heticr lfait>RUssii>n chap.r.eJ aiid j lin:*:;?? hn rau' ^•*tii]J K? ^eiecic<1 
shan lot ofhtrt ohrccf.5 ibuck^snHinJ}. Ho'-m-vvi, ihiy icaUsu- iias lo bo ^urpijiied by s^ir 
mobile neuvork 

nwy gjovv ;n ilic ;u>iMn«>l>jlc ftvccsscd l«?it';7iri, as cu?Terit?v j\ pitiable v!;:ita nuulcrn^^ S<ii 
!*0 1'S t Pl^in? OKi 1 ph*'^ti%* ServK.^^ ? vu^ipiKi e!b:u/,h baiui'A'sdih r ^o.S khii si dyr ?ea^- 
'line ^t.jt'o ap|Mu'^iif:i^-; Ihv r.cxi ;.!tr.if amvi i>i hfi^h ^pi'*^d nindcjn>v ba.scd <»ri AliSL 
(AsyninuMnc I>K»u^i .Sybscrih<.'r 1 ir.i:M>r i:ib:i;!qucm iivruioloi^ic*. wji^h have lo he 
supported by the ph^-iiirit riC'wojk jjcc^a pj;n:Jvr n;ay piOMJi' ihe iVqiiucd bii:id^ 
wiiilh or up U» 1 Mbil ?» cirahlimj the hjjjikaM k^I % uieo applKati»)ri> 

Market push 

N«ii »?iilv v1i.>c^ c\cr> new ^u*p in ;i:vhjK^>>^> ciubk v\ciuui| nc^^ a^^i ko»i^^puia- 
lu^iulh iiiiuc iicrjiaiuhni; api^i:v'<JiiimN. bus \ | Si ?tvhpi?i<iji) als\» nccib 1ms c\t's> lu-vs 
dcvckipn^irsii su*p i^c\s inass ^pjiuc.'iiirirt^. ^.^hich v^%i a liipli tiaiiibcs' ^ihcon chip-, m 
n:ruiii iSie dcvcjupincni ir*vcsii:;ciii:. Hu- hi4*hc>; jciji^a of m^i-sunu-i: *xi'uis, -^^-hcii 
pi\>du':L5 uhicH aic ah'::ail nfih*.; marfcei le j: bccau-ii* «r hii:h pi'jibmiancc), ^ran b'j aohi 
u» a larj:c nuniK-r rif atsuimef; wish hirh prf«lil For L-xainplf. im nu* J iiinvpiTloi- 

t'veniijaity be too %nviil u> i*am ih*:- niorn?) iilreasK in^ c-^icd m ihc il£*\tfh>pmeiJi k«rthe 
pnKcf-sOi o! Lit:: branj. 1 he need to finJ and p-Lsntotc thv^e new %elhn;j'* apphcaiif^ns
s f^u'K? and inoi< nuiyt^iVM, c\ en nv\y iicp sihcon tcchn»tlo>iy hich- 

r)y 1 -i dcpulH iIk' pj»,v4..c\*5jOr pciioriujiuc c\oIunt>;i jiiJ sonic M St-V-'iliiiiJ" apph- 
-aao::.-! I oj cxampU*. Jtouiid h>K» Mscsobcfi uU:onc!> pjomoicd liu'ir uraphic:]! 



^084 ^<i88 l&jl' iGij^> 200C 2CC^ |i\ 0 l>it'!Mi \tr-*ah% :> ^;. 

i:omp3>ut!fm:»( pnw^r i>r sia?e-i'^lHh€-3n ^, '.vhu'li have K'cortiv jj^udabUr ior Ihc 
rru^i^-rnarkoi. stc ^iiil:cicnt ibr u pjcal applications like out pn>;Ct:»>?fu:. hticmei. ci^ 
1 hi^ rcs^iu into dn^ppm^ dcinsnd:^ ibi ruuh f>f rio:ir!:inrr rjH 's well ^> in mttppm^ 

imi or X :^ ' h N 2iil ' I V ii>r!iw:j cfsmcs iiurs Uu' iMnpt? r»i the pivicc^sin*! vopabiliJio- 
iii ihir viis:c«r liii?h'|>cf ioisnajiLc CI*U ^'v.ncr;tuo;:, C5iHCull> hciv* ^pccul muluntcdia 
pioct:i*or msinivut>ti>3fc avateabtc I ncictoiv. processor nwruiraciuK^SCuisctiUypio 
nu:K* viikfi a ppl I. T 3 rjcm 5 <c y by inx iufiuja* dis:nbur:or>i. cvcrsu* ,i dcwn^iia ^oi 

a\;'j!l3l>"e IS ntii Slit ficicn; for rcai'iiW appiu auoni b^'yond C H* (couini-ori inici- 
Algorithms, Complexity Analysis
and VLSI Architectures for 
MPEG-4 Motion Estimation 



Peter Kuhn 

MPEG-4 is the multimedia standard to combine interactivity, natural and synthetic
digital video, audio and computer graphics. Typical applications are: Internet video
conferencing, mobile videophone, multimedia cooperative work, teleteaching and
games. With MPEG-4 the next step from block-based video (ISO/IEC MPEG-1,
CCITT H.261, ITU-T H.263) to arbitrarily-shaped visual objects is taken.
This significant step demands a new methodology for system analysis and design to
meet the considerably higher flexibility of MPEG-4.



Motion estimation is a central part of MPEG-1/2/4 and the H.261/H.263 video compression
standards and has attracted much attention in research and industry, for the
following reasons: it is computationally the most demanding algorithm of a video
encoder (about 60-80% computation time), it has a high impact on the visual quality
of a video encoder, and it is not standardized, thus being open to competition.



- Fast motion estimation algorithms

- Detailed complexity analysis of a software implementation of MPEG-4 video
- Complexity and visual quality analysis of fast motion estimation algorithms within MPEG-4

- Design space on motion estimation VLSI architectures

- Detailed VLSI design examples of 1.) a high throughput and 2.) a low-power
  MPEG-4 motion estimator
Introduction

1 Purpose 

This Part of this specification was developed in response to the growing need for a generic coding method 
of moving pictures and of associated sound for various applications such as digital storage media,
television broadcasting and communication. The use of this specification means that motion video can be
manipulated as a form of computer data and can be stored on various storage media, transmitted and
received over existing and future networks and distributed on existing and future broadcasting channels.

2 Application 

The applications of this specification cover, but are not limited to, such areas as listed below: 

BSS Broadcasting Satellite Service (to the home) 

CATV Cable TV Distribution on optical networks, copper, etc. 

CDAD Cable Digital Audio Distribution

DSB Digital Sound Broadcasting (terrestrial and satellite broadcasting) 

DTTB Digital Terrestrial Television Broadcasting 

EC Electronic Cinema 

ENG Electronic News Gathering (including SNG, Satellite News Gathering) 

FSS Fixed Satellite Service (e.g. to head ends) 

HTT Home Television Theatre 

IPC Interpersonal Communications (videoconferencing, videophone, etc.) 

ISM Interactive Storage Media (optical disks, etc.) 

MMM Multimedia Mailing 

NCA News and Current Affairs

NDB Networked Database Services (via ATM, etc.) 

RVS Remote Video Surveillance 

SSM Serial Storage Media (digital VTR, etc.) 

3 Profiles and levels 

This specification is intended to be generic in the sense that it serves a wide range of applications, bitrates, 
resolutions, qualities and services. Applications should cover, among other things, digital storage media, 
television broadcasting and communications. In the course of creating this specification, various 
requirements from typical applications have been considered, necessary algorithmic elements have been
developed, and they have been integrated into a single syntax. Hence this specification will facilitate the
bitstream interchange among different applications.

Considering the practicality of implementing the full syntax of this specification, however, a limited
number of subsets of the syntax are also stipulated by means of "profile" and "level". These and other
related terms are formally defined in clause 3 of this specification.
A "profile" is a defined subset of the entire bitstream syntax that is defined by this specification. Within
the bounds imposed by the syntax of a given profile it is still possible to require a very large variation in
the performance of encoders and decoders depending upon the values taken by parameters in the
bitstream. For instance it is possible to specify frame sizes as large as (approximately) 2^14 samples wide
by 2^14 lines high. It is currently neither practical nor economic to implement a decoder capable of
dealing with all possible frame sizes.

In order to deal with this problem "levels" are defined within each profile. A level is a defined set of
constraints imposed on parameters in the bitstream. These constraints may be simple limits on numbers.
Alternatively they may take the form of constraints on arithmetic combinations of the parameters (e.g.
frame width multiplied by frame height multiplied by frame rate).
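
The two kinds of level constraint described above (simple limits and limits on arithmetic combinations) can be sketched as follows. This is a minimal illustration only: the class name, the function name and all numeric limits are assumptions chosen for the example and are not quoted from this specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Hypothetical constraint set for one level (illustrative values only)."""
    max_width: int          # samples per line
    max_height: int         # lines per frame
    max_frame_rate: float   # frames per second
    max_sample_rate: int    # width * height * frame rate, samples per second


def conforms(width, height, frame_rate, limits):
    """Return True if the parameters satisfy every constraint of the level."""
    # Simple limits on individual numbers.
    if width > limits.max_width or height > limits.max_height:
        return False
    if frame_rate > limits.max_frame_rate:
        return False
    # Constraint on an arithmetic combination of the parameters.
    return width * height * frame_rate <= limits.max_sample_rate


# Assumed example values, not taken from the specification.
EXAMPLE_LEVEL = LevelLimits(max_width=720, max_height=576,
                            max_frame_rate=30.0, max_sample_rate=10_368_000)

print(conforms(720, 576, 25.0, EXAMPLE_LEVEL))  # True
print(conforms(720, 576, 30.0, EXAMPLE_LEVEL))  # False: combined rate too high
```

Note that the second call passes every individual limit and fails only on the combined sample rate, which is the reason a level needs both kinds of constraint.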

Bitstreams complying with this specification use a common syntax. In order to achieve a subset of the
complete syntax, flags and parameters are included in the bitstream that signal the presence or otherwise
of syntactic elements that occur later in the bitstream. In order to specify constraints on the syntax (and 
hence define a profile) it is thus only necessary to constrain the values of these flags and parameters that 
specify the presence of later syntactic elements. 

4 The scalable and the non-scalable syntax 

The full syntax can be divided into two major categories: One is the non-scalable syntax, which is
structured as a super set of the syntax defined in ISO/IEC 11172-2. The main feature of the non-scalable
syntax is the extra compression tools for interlaced video signals. The second is the scalable syntax, the
key property of which is to enable the reconstruction of useful video from pieces of a total bitstream. This
is achieved by structuring the total bitstream in two or more layers, starting from a standalone base layer
and adding a number of enhancement layers. The base layer can use the non-scalable syntax, or in some
situations conform to the ISO/IEC 11172-2 syntax.

4.1 Overview of the non-scalable syntax 

The coded representation defined in the non-scalable syntax achieves a high compression ratio while
preserving good image quality. The algorithm is not lossless as the exact sample values are not preserved 
during coding. Obtaining good image quality at the bitrates of interest demands very high compression, 
which is not achievable with intra picture coding alone. The need for random access, however, is best 
satisfied with pure intra picture coding. The choice of the techniques is based on the need to balance a 
high image quality and compression ratio with the requirement to make random access to the coded 
bitstream. 

A number of techniques are used to achieve high compression. The algorithm first uses block-based 
motion compensation to reduce the temporal redundancy. Motion compensation is used both for causal 
prediction of the current picture from a previous picture, and for non-causal, interpolative prediction from 
past and future pictures. Motion vectors are defined for each 16-sample by 16-line region of the picture. 
The prediction error is further compressed using the discrete cosine transform (DCT) to remove spatial
correlation before it is quantised in an irreversible process that discards the less important information. 
Finally, the motion vectors are combined with the quantised DCT information, and encoded using 
variable length codes. 
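
As a rough illustration of the prediction step described above, the following Python sketch forms a causal prediction from a past picture, an interpolative prediction from past and future pictures, and the resulting prediction error. The function names, the 2 by 2 block size (the text uses 16 by 16 regions) and the rounding used when averaging are assumptions made for this example.

```python
def fetch_block(picture, top, left, size):
    """Copy the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def predict(reference, top, left, motion_vector, size):
    """Motion-compensated prediction: read the block displaced by the vector."""
    dy, dx = motion_vector
    return fetch_block(reference, top + dy, left + dx, size)


def predict_interpolated(past, future, top, left, mv_past, mv_future, size):
    """Non-causal prediction: average of a past and a future prediction."""
    a = predict(past, top, left, mv_past, size)
    b = predict(future, top, left, mv_future, size)
    # Rounded integer average (the rounding rule is an assumption).
    return [[(p + q + 1) // 2 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def prediction_error(current_block, predicted_block):
    """Sample-wise difference that would go on to the DCT stage."""
    return [[c - p for c, p in zip(rc, rp)]
            for rc, rp in zip(current_block, predicted_block)]


# A bright 2x2 patch moving one sample down and right per picture.
past    = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
current = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]
future  = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 9, 9], [0, 0, 9, 9]]

block = fetch_block(current, 1, 1, 2)
print(prediction_error(block, predict(past, 1, 1, (0, 0), 2)))    # [[0, 9], [9, 9]]
print(prediction_error(block, predict(past, 1, 1, (-1, -1), 2)))  # [[0, 0], [0, 0]]
print(prediction_error(
    block, predict_interpolated(past, future, 1, 1, (-1, -1), (1, 1), 2)))  # [[0, 0], [0, 0]]
```

With the correct motion vector the prediction error vanishes, which is why motion compensation reduces the amount of data left for the transform stage.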

4.1.1 Temporal processing 

Because of the conflicting requirements of random access and highly efficient compression, three main 
picture types are defined. Intra coded pictures (I-Pictures) are coded without reference to other pictures. 
They provide access points to the coded sequence where decoding can begin, but are coded with only
moderate compression. Predictive coded pictures (P-Pictures) are coded more efficiently using motion
compensated prediction from a past intra or predictive coded picture and are generally used as a reference 
for further prediction. Bidirectionally-predictive coded pictures (B-Pictures) provide the highest degree of
compression but require both past and future reference pictures for motion compensation. Bidirectionally-
predictive coded pictures are never used as references for prediction (except in the case that the resulting
picture is used as a reference in a spatially scalable enhancement layer). The organisation of the three
picture types in a sequence is very flexible. The choice is left to the encoder and will depend on the
requirements of the application. Figure I-1 illustrates an example of the relationship among the three
different picture types.




Figure 1 Example of temporal picture structure 
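
Because a B-Picture needs both a past and a future reference, the order in which pictures can be coded differs from the order in which they are displayed. The sketch below derives one possible coding order from a display-order list of picture types. It assumes that each B-Picture is predicted from the nearest surrounding I- or P-Pictures; the function name and the example sequence are hypothetical.

```python
def coding_order(display_types):
    """Return display indices in an order where references precede B-Pictures."""
    order = []
    waiting_b = []
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            # A B-Picture has to wait for its future reference picture.
            waiting_b.append(index)
        else:
            # The I- or P-Picture is coded first, then the B-Pictures before it.
            order.append(index)
            order.extend(waiting_b)
            waiting_b = []
    # Trailing B-Pictures with no future reference are appended unchanged.
    order.extend(waiting_b)
    return order


display = list("IBBPBBP")
print(coding_order(display))  # [0, 3, 1, 2, 6, 4, 5]
```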



4.1.2 Coding interlaced video

Each frame of interlaced video consists of two fields which are separated by one field-period. The 
specification allows either the frame to be encoded as picture or the two fields to be encoded as two 
pictures. Frame encoding or field encoding can be adaptively selected on a frame-by-frame basis. Frame 
encoding is typically preferred when the video scene contains significant detail with limited motion. Field
encoding, in which the second field can be predicted from the first, works better when there is fast
movement. 
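
The following sketch illustrates the frame/field decision on toy data. The selection rule used here (compare the line-to-line differences of the interleaved frame with those of the two separated fields) is a simple heuristic assumed for the example; the specification leaves the choice to the encoder, and all names are hypothetical.

```python
def split_fields(frame):
    """Separate a frame (a list of lines) into its top and bottom fields."""
    return frame[0::2], frame[1::2]


def interline_activity(lines):
    """Sum of absolute differences between vertically adjacent lines."""
    return sum(abs(a - b)
               for upper, lower in zip(lines, lines[1:])
               for a, b in zip(upper, lower))


def choose_picture_structure(frame):
    """Pick 'field' when the separated fields are smoother than the frame."""
    top, bottom = split_fields(frame)
    frame_cost = interline_activity(frame)
    field_cost = interline_activity(top) + interline_activity(bottom)
    return "field" if field_cost < frame_cost else "frame"


# Static scene: a smooth vertical gradient across all lines.
still = [[0, 0], [1, 1], [2, 2], [3, 3]]
# Fast motion: the two fields differ strongly, giving a comb pattern.
moving = [[0, 0], [10, 10], [0, 0], [10, 10]]

print(choose_picture_structure(still))   # frame
print(choose_picture_structure(moving))  # field
```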

4.1.3 Motion representation - macroblocks

As in ISO/IEC 11172-2, the choice of 16 by 16 macroblocks for the motion-compensation unit is a result
of the trade-off between the coding gain provided by using motion information and the overhead needed to
represent it. Each macroblock can be temporally predicted in one of a number of different ways. For
example, in frame encoding, the prediction from the previous reference frame can itself be either
frame-based or field-based. Depending on the type of the macroblock, motion vector information and other side
information is encoded with the compressed prediction error in each macroblock. The motion vectors are
encoded differentially with respect to the last encoded motion vectors using variable length codes. The
maximum length of the motion vectors that may be represented can be programmed, on a picture-by- 
picture basis, so that the most demanding applications can be met without compromising the performance 
of the system in more normal situations. 
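
Differential coding of motion vectors can be sketched as below. The example assumes a predictor that starts at (0, 0) and is simply the previously coded vector; the rules for resetting the predictor and the variable length code tables are outside the scope of this sketch, and the function names are hypothetical.

```python
def encode_differential(vectors):
    """Replace each motion vector by its difference from the previous one."""
    previous = (0, 0)  # assumed initial predictor
    deltas = []
    for dy, dx in vectors:
        deltas.append((dy - previous[0], dx - previous[1]))
        previous = (dy, dx)
    return deltas


def decode_differential(deltas):
    """Rebuild the motion vectors by accumulating the differences."""
    previous = (0, 0)
    vectors = []
    for ddy, ddx in deltas:
        previous = (previous[0] + ddy, previous[1] + ddx)
        vectors.append(previous)
    return vectors


vectors = [(3, -1), (3, -1), (4, -1), (4, 0)]
deltas = encode_differential(vectors)
print(deltas)                                  # [(3, -1), (0, 0), (1, 0), (0, 1)]
print(decode_differential(deltas) == vectors)  # True
```

Neighbouring macroblocks often move alike, so most differences are small, and small values are the ones a variable length code can represent with short code words.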

It is the responsibility of the encoder to calculate appropriate motion vectors. The specification does not
specify how this should be done. 
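
Since the method is left open, any search strategy may be used. One common choice, shown here purely as an assumed example and not as the method of this specification, is an exhaustive (full) search that minimises the sum of absolute differences (SAD) over a search window. The names, the tiny block size and the search range are hypothetical.

```python
def sad(current, reference, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a displaced candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[top + y][left + x]
                         - reference[top + dy + y][left + dx + x])
    return total


def full_search(current, reference, top, left, size, search_range):
    """Try every displacement in the window and keep the lowest SAD."""
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ry, rx = top + dy, left + dx
            # Skip candidates that fall outside the reference picture.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(current, reference, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


reference = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
current   = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]

print(full_search(current, reference, top=1, left=1, size=2, search_range=1))
# ((-1, -1), 0)
```

The cost of this search grows with the square of the search range, which is one reason faster search strategies are attractive in practice.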
4.1.4 Spatial redundancy reduction

Both source pictures and prediction errors have high spatial redundancy. This specification uses a
block-based DCT method with visually weighted quantisation and run-length coding. After motion compensated
prediction or interpolation, the resulting prediction error is split into 8 by 8 blocks. These are transformed
into the DCT domain where they are weighted before being quantised. After quantisation many of the
DCT coefficients are zero in value and so two-dimensional run-length and variable length coding is used
to encode the remaining DCT coefficients efficiently.
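
The chain of 8 by 8 DCT, weighted quantisation and run-length coding can be sketched in plain Python as follows. Several details are assumptions made for the example: the orthonormal DCT scaling, the flat weighting matrix (real matrices typically use larger weights for higher frequencies), the zigzag scan order, and the treatment of the DC coefficient like any other coefficient. All function names are hypothetical.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 8x8 DCT-II, computed directly from its definition."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantise(coefficients, weights, quantiser_scale):
    """Divide each coefficient by its weight and the scale, then round."""
    return [[round(coefficients[u][v] / (weights[u][v] * quantiser_scale))
             for v in range(N)] for u in range(N)]


def zigzag_indices():
    """(row, column) pairs ordered along anti-diagonals (assumed scan order)."""
    cells = [(u, v) for u in range(N) for v in range(N)]
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def run_level_pairs(quantised):
    """Two-dimensional run-length code: (zeros skipped, non-zero level)."""
    pairs, run = [], 0
    for u, v in zigzag_indices():
        level = quantised[u][v]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs  # trailing zeros are left to an end-of-block marker


flat_block = [[16] * N for _ in range(N)]
flat_weights = [[16] * N for _ in range(N)]  # assumed flat weighting
levels = quantise(dct_2d(flat_block), flat_weights, quantiser_scale=1)
print(run_level_pairs(levels))  # [(0, 8)]
```

A block without detail collapses to a single (run, level) pair, which shows how the transform concentrates the energy and why the remaining zeros cost almost nothing to code.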

4.1.5 Chrominance formats 

In addition to the 4:2:0 format supported in ISO/IEC 11172-2 this specification supports 4:2:2 and 4:4:4
chrominance formats. 
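
The three chrominance formats differ in how the two chrominance planes are subsampled relative to luminance. The short sketch below computes the resulting plane sizes; the function names and the 720 by 576 example size are chosen only for illustration.

```python
# (horizontal divisor, vertical divisor) of each chrominance plane
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),  # half width, half height
    "4:2:2": (2, 1),  # half width, full height
    "4:4:4": (1, 1),  # full width, full height
}


def chroma_plane_size(luma_width, luma_height, chroma_format):
    """Width and height of one chrominance plane."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    return luma_width // h_div, luma_height // v_div


def samples_per_frame(luma_width, luma_height, chroma_format):
    """Luminance samples plus the samples of both chrominance planes."""
    cw, ch = chroma_plane_size(luma_width, luma_height, chroma_format)
    return luma_width * luma_height + 2 * cw * ch


for fmt in ("4:2:0", "4:2:2", "4:4:4"):
    print(fmt, chroma_plane_size(720, 576, fmt), samples_per_frame(720, 576, fmt))
# 4:2:0 (360, 288) 622080
# 4:2:2 (360, 576) 829440
# 4:4:4 (720, 576) 1244160
```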

4.2 Scalable extensions 

The scalability tools in this specification are designed to support applications beyond that supported by 
single layer video. Among the noteworthy applications areas addressed are video telecommunications, 
video on asynchronous transfer mode networks (ATM), interworking of video standards, video service 
hierarchies with multiple spatial, temporal and quality resolutions, HDTV with embedded TV, systems 
allowing migration to higher temporal resolution HDTV etc. Although a simple solution to scalable video 
is the simulcast technique which is based on transmission/storage of multiple independently coded
reproductions of video, a more efficient alternative is scalable video coding, in which the bandwidth 
allocated to a given reproduction of video can be partially re-utilised in coding of the next reproduction of 
video. In scalable video coding, it is assumed that given a coded bitstream, decoders of various 
complexities can decode and display appropriate reproductions of coded video. A scalable video encoder 
is likely to have increased complexity when compared to a single layer encoder. However, this standard 
provides several different forms of scalabilities that address non-overlapping applications with
corresponding complexities. The basic scalability tools offered are: data partitioning, SNR scalability,
spatial scalability and temporal scalability. Moreover, combinations of these basic scalability tools are
also supported and are referred to as hybrid scalability. In the case of basic scalability, two layers of video
referred to as the lower layer and the enhancement layer are allowed, whereas in hybrid scalability up to
three layers are supported. The following Tables provide a few example applications of various 
scalabilities. 



Table 1 Applications of SNR scalability

Lower layer                   Enhancement layer                          Application
Recommendation ITU-R BT.601   Same resolution and format as lower layer  Two quality service for Standard TV (SDTV)
High Definition               Same resolution and format as lower layer  Two quality service for HDTV
4:2:0 High Definition         4:2:2 chroma simulcast                     Video production / distribution
Table 2 Applications of spatial scalability



Base 


Enhancement 


Application 


progressive(30Hz) 
interlace(30Hz)
progressive(30Hz) 
interlace(30Hz) 


progressive(30Hz) 
interlace(30Hz)
interlace(30Hz)
progressive(60Hz) 


HDTV/SDTV scalability 

ISO/IEC 11172-2/compatibility with this specification
Migration to high resolution progressive HDTV 



Table 3. Applications of temporal scalability

Base               Enhancement        Higher              Application
progressive(30Hz)  progressive(30Hz)  progressive (60Hz)  Migration to high resolution progressive HDTV
interlace(30Hz)    interlace(30Hz)    progressive (60Hz)  Migration to high resolution progressive HDTV



4.2.1 Spatial scalable extension 

Spatial scalability is a tool intended for use in video applications involving telecommunications, 
interworking of video standards, video database browsing, interworking of HDTV and TV etc., i.e., video 
systems with the primary common feature that a minimum of two layers of spatial resolution are
necessary. Spatial scalability involves generating two spatial resolution video layers from a single video
source such that the lower layer is coded by itself to provide the basic spatial resolution and the
enhancement layer employs the spatially interpolated lower layer and carries the full spatial resolution of
the input video source. The lower and the enhancement layers may either both use the coding tools in this
specification, or the ISO/IEC 11172-2 standard for the lower layer and this specification for the
enhancement layer. The latter case achieves a further advantage by facilitating interworking between
video coding standards. Moreover, spatial scalability offers flexibility in choice of video formats to be
employed in each layer. An additional advantage of spatial scalability is its ability to provide resilience to
transmission errors as the more important data of the lower layer can be sent over a channel with better
error performance, while the less critical enhancement layer data can be sent over a channel with poor
error performance. 
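
The two-layer structure of spatial scalability can be illustrated on a single line of samples. In this sketch the lower layer is a half-resolution version of the source, and the enhancement layer carries the difference between the source and the interpolated lower layer. The averaging downsampler, the sample-repeating interpolator and the absence of any quantisation are simplifying assumptions; the names are hypothetical.

```python
def downsample(line):
    """Halve the resolution by averaging sample pairs (even length assumed)."""
    return [(line[i] + line[i + 1]) // 2 for i in range(0, len(line), 2)]


def interpolate(lower, length):
    """Bring the lower layer back to full resolution by repeating samples."""
    return [lower[i // 2] for i in range(length)]


def encode_layers(line):
    """Return (lower layer, enhancement layer) for one line of samples."""
    lower = downsample(line)
    prediction = interpolate(lower, len(line))
    enhancement = [s - p for s, p in zip(line, prediction)]
    return lower, enhancement


def decode_full(lower, enhancement):
    """Full-resolution reconstruction from both layers."""
    prediction = interpolate(lower, len(enhancement))
    return [p + e for p, e in zip(prediction, enhancement)]


source = [10, 12, 20, 24, 30, 30, 8, 4]
lower, enhancement = encode_layers(source)
print(lower)        # [11, 22, 30, 6]
print(enhancement)  # [-1, 1, -2, 2, 0, 0, 2, -2]
print(decode_full(lower, enhancement) == source)  # True
```

A receiver that only gets the lower layer can still display the half-resolution line, while the small enhancement values show that the interpolated lower layer is already a useful prediction.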

4.2.2 SNR scalable extension 

SNR scalability is a tool intended for use in video applications involving telecommunications, video 
services with multiple qualities, standard TV and HDTV, i.e., video systems with the primary common 
feature that a minimum of two layers of video quality are necessary. SNR scalability involves generating 
two video layers of same spatial resolution but different video qualities from a single video source such
that the lower layer is coded by itself to provide the basic video quality and the enhancement layer is
coded to enhance the lower layer. The enhancement layer when added back to the lower layer regenerates
a higher quality reproduction of the input video. The lower and the enhancement layers may either use
this specification or ISO/IEC 11172-2 standard for the lower layer and this specification for the
enhancement layer. An additional advantage of SNR scalability is its ability to provide high degree of
resilience to transmission errors as the more important data of the lower layer can be sent over a channel
with better error performance, while the less critical enhancement layer data can be sent over a channel
with poor error performance. 
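
One way to picture SNR scalability is as two quantisation passes over the same coefficients: a coarse pass for the lower layer and a finer pass over what the coarse pass lost. The sketch below follows that idea under assumed step sizes; it is an illustration of the principle only, and the names and values are hypothetical.

```python
def quantise(values, step):
    """Uniform quantiser: nearest multiple of the step, expressed as a level."""
    return [round(v / step) for v in values]


def dequantise(levels, step):
    """Inverse of the uniform quantiser."""
    return [level * step for level in levels]


def encode_snr_layers(coefficients, coarse_step, fine_step):
    """Lower layer: coarse levels. Enhancement layer: refined residual."""
    lower = quantise(coefficients, coarse_step)
    lower_recon = dequantise(lower, coarse_step)
    residual = [c - r for c, r in zip(coefficients, lower_recon)]
    enhancement = quantise(residual, fine_step)
    return lower, enhancement


def decode(lower, enhancement, coarse_step, fine_step):
    """Add the enhancement layer back to the lower layer if it is available."""
    recon = dequantise(lower, coarse_step)
    if enhancement is not None:
        refinement = dequantise(enhancement, fine_step)
        recon = [r + e for r, e in zip(recon, refinement)]
    return recon


coefficients = [100, -37, 12, 5]
lower, enhancement = encode_snr_layers(coefficients, coarse_step=16, fine_step=4)
print(lower, enhancement)                 # [6, -2, 1, 0] [1, -1, -1, 1]
print(decode(lower, None, 16, 4))         # [96, -32, 16, 0]
print(decode(lower, enhancement, 16, 4))  # [100, -36, 12, 4]
```

Both reconstructions have the same number of samples; only the accuracy differs, which matches the "same spatial resolution but different video qualities" description above.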
4.2.3 Temporal scalable extension

Temporal scalability is a tool intended for use in a range of diverse video applications from
telecommunications to HDTV for which migration to higher temporal resolution systems from that of
lower temporal resolution systems may be necessary. In many cases, the lower temporal resolution video
systems may be either the existing systems or the less expensive early generation systems, with the
motivation of introducing more sophisticated systems gradually. Temporal scalability involves
partitioning of video frames into layers, where the lower layer is coded by itself to provide the basic
temporal rate and the enhancement layer is coded with temporal prediction with respect to the lower layer;
these layers, when decoded and temporally multiplexed, yield the full temporal resolution of the video source.
The lower temporal resolution systems may only decode the lower layer to provide basic temporal
resolution, whereas more sophisticated systems of the future may decode both layers and provide high
temporal resolution video while maintaining interworking with earlier generation systems. An additional
advantage of temporal scalability is its ability to provide resilience to transmission errors as the more
important data of the lower layer can be sent over a channel with better error performance, while the less
critical enhancement layer can be sent over a channel with poor error performance.
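
The partitioning and re-multiplexing of frames can be sketched with frame indices standing in for decoded pictures. The alternating assignment of frames to layers is an assumption made for this example, and the function names are hypothetical.

```python
def partition_frames(frames):
    """Assign alternate frames to the lower and the enhancement layer."""
    return frames[0::2], frames[1::2]


def multiplex(lower, enhancement):
    """Interleave the decoded layers to restore the full temporal rate."""
    merged = []
    for i, frame in enumerate(lower):
        merged.append(frame)
        if i < len(enhancement):
            merged.append(enhancement[i])
    return merged


frames_60hz = list(range(8))           # eight frames of a 60 Hz source
lower, enhancement = partition_frames(frames_60hz)
print(lower)                           # [0, 2, 4, 6], a 30 Hz sequence
print(multiplex(lower, enhancement))   # [0, 1, 2, 3, 4, 5, 6, 7]
```

An earlier generation decoder would display only the lower layer list, while a newer decoder would display the multiplexed result.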

4.2.4 Data partitioning extension 

Data partitioning is a tool intended for use when two channels are available for transmission and/or
storage of a video bitstream, as may be the case in ATM networks, terrestrial broadcast, magnetic media,
etc. The bitstream is partitioned between these channels such that more critical parts of the bitstream
(such as headers, motion vectors, low frequency DCT coefficients) are transmitted in the channel with the
better error performance, and less critical data (such as higher frequency DCT coefficients) is transmitted
in the channel with poor error performance. Thus, degradation due to channel errors is minimised since the
critical parts of a bitstream are better protected. Data from neither channel may be decoded on a decoder
that is not intended for decoding data partitioned bitstreams.
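
A minimal sketch of the partitioning idea follows. Each coded block is split at a breakpoint: the header, the motion vector and the first coefficients go to the better protected partition, and the remaining coefficients go to the other one. The dictionary layout, the name `breakpoint_index` and the zero-filling of lost coefficients are assumptions made for the example.

```python
def partition_block(header, motion_vector, coefficients, breakpoint_index):
    """Split one block's data into a critical and a less critical partition."""
    critical = {"header": header,
                "motion_vector": motion_vector,
                "coefficients": coefficients[:breakpoint_index]}
    less_critical = {"coefficients": coefficients[breakpoint_index:]}
    return critical, less_critical


def reassemble(critical, less_critical, total_coefficients):
    """Rebuild the block; missing high-frequency data is replaced by zeros."""
    tail = less_critical["coefficients"] if less_critical is not None else []
    coefficients = critical["coefficients"] + tail
    coefficients = coefficients + [0] * (total_coefficients - len(coefficients))
    return critical["header"], critical["motion_vector"], coefficients


coefficients = [8, 3, -2, 0, 1, 0, 0, -1]   # scan order, low frequencies first
critical, less_critical = partition_block("P", (1, -2), coefficients, 3)

print(reassemble(critical, less_critical, 8))
# ('P', (1, -2), [8, 3, -2, 0, 1, 0, 0, -1])
print(reassemble(critical, None, 8))
# ('P', (1, -2), [8, 3, -2, 0, 0, 0, 0, 0])
```

Losing the less critical partition removes only fine detail, whereas the header and motion vector needed to place the block survive.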
INTERNATIONAL STANDARD 13818-2



RECOMMENDATION ITU-T H.262 




1 Scope



This Recommendation | International Standard specifies the coded representation of picture information 
for digital storage media and digital video communication and specifies the decoding process. The 
representation supports constant bitrate transmission, variable bitrate transmission, random access,
channel hopping, scalable decoding, bitstream editing, as well as special functions such as fast forward
playback, fast reverse playback, slow motion, pause and still pictures. This Recommendation |
International Standard is forward compatible with ISO/IEC 11172-2 and upward or downward compatible
with EDTV, HDTV, SDTV formats. 

This Recommendation | International Standard is primarily applicable to digital storage media, video 
broadcast and communication. The storage media may be directly connected to the decoder, or via 
communications means such as busses, LANs, or telecommunications links. 



The following ITU-T Recommendations and International Standards contain provisions which through 
reference in this text, constitute provisions of this Recommendation | International Standard. At the time 
of publication, the editions indicated were valid. All Recommendations and Standards are subject to 
revision, and parties to agreements based on this Recommendation | International Standard are 
encouraged to investigate the possibility of applying the most recent editions of the standards indicated 
below. Members of IEC and ISO maintain registers of currently valid International Standards. The
Telecommunication Standardisation Bureau maintains a list of currently valid ITU-T Recommendations.
Normative references
Recommendations and reports of the CCIR, 1990 XVIIth Plenary Assembly, Dusseldorf, 1990
Volume XI - Part 1 Broadcasting Service (Television) Recommendation ITU-R BT.601-3
"Encoding parameters of digital television for studios".

CCIR Volume X and XI Part 3 Recommendation ITU-R BR.648 "Recording of audio signals".

CCIR Volume X and XI Part 3 Report ITU-R 955-2 "Satellite sound broadcasting to vehicular, 
portable and fixed receivers in the range 500 - 3000 MHz".

ISO/IEC 11172-1 1993, Information technology — Coding of moving pictures and associated 
audio for digital storage media at up to about 1,5 Mbit/s — Part 1: Systems, 

ISO/IEC 11172-2 1993, Information technology — Coding of moving pictures and associated 
audio for digital storage media at up to about 1,5 Mbit/s — Part 2: Video, 

ISO/IEC 11172-3 1993, Information technology — Coding of moving pictures and associated 
audio for digital storage media at up to about 1,5 Mbit/s — Part 3: Audio, 

IEEE Standard Specifications for the Implementations of 8 by 8 Inverse Discrete Cosine 
Transform, IEEE Std 1180-1990, December 6, 1990.

IEC Publication 908:1987, CD Digital Audio System,

IEC Publication 461:1986, Time and control code for video tape recorder.



ITU-T Recommendation H.261 (Formerly CCITT Recommendation H.261) Codec for
audiovisual services at px64 kbit/s Geneva, 1990. 

ISO/IEC 10918-1:1994 | Recommendation ITU-T T.81 (JPEG) Information Technology — 
Digital compression and coding of continuous-tone still images: Requirements and guidelines. 

For the purposes of this Recommendation | International Standard, the following definitions apply.

3.1 AC coefficient: Any DCT coefficient for which the frequency in one or both dimensions is
non-zero.

3.2 big picture: A coded picture that would cause VBV buffer underflow as defined in C.7 of
Annex C. Big pictures can only occur in sequences where low_delay is equal to 1. "Skipped
picture" is a term that is sometimes used to describe the same concept.

3.3 B-field picture: A field structure B-Picture.

3.4 B-frame picture: A frame structure B-Picture. 

3.5 B-picture; bidirectionally predictive-coded picture: A picture that is coded using motion 
compensated prediction from past and/or future reference fields or frames. 

3.6 backward compatibility: A newer coding standard is backward compatible with an older
coding standard if decoders designed to operate with the older coding standard are able to
continue to operate by decoding all or part of a bitstream produced according to the newer
coding standard.

3.7 backward motion vector: A motion vector that is used for motion compensation from a
reference frame or reference field at a later time in display order.

3.8 backward prediction: Prediction from the future reference frame (field). 

3.9 base layer: First, independently decodable layer of a scalable hierarchy.

3.10 bitstream; stream: An ordered series of bits that forms the coded representation of the data.

3.11 bitrate: The rate at which the coded bitstream is delivered from the storage medium to the 
input of a decoder. 

3.12 block: An 8-row by 8-column matrix of samples, or 64 DCT coefficients (source, quantised
or dequantised).

3.13 bottom field: One of two fields that comprise a frame. Each line of a bottom field is spatially 
located immediately below the corresponding line of the top field. 

3.14 byte aligned: A bit in a coded bitstream is byte-aligned if its position is a multiple of 8-bits 
from the first bit in the stream. 

3.15 byte: Sequence of 8-bits. 

3.16 channel: A digital medium that stores or transports a bitstream constructed according to this 
specification. 

3.17 chrominance format: Defines the number of chrominance blocks in a macroblock. 

3.18 chroma simulcast: A type of scalability (which is a subset of SNR scalability) where the
enhancement layer(s) contain only coded refinement data for the DC coefficients, and all
the data for the AC coefficients, of the chrominance components.

3.19 chrominance component: A matrix, block or single sample representing one of the two 
colour difference signals related to the primary colours in the manner defined in the 
bitstream. The symbols used for the chrominance signals are Cr and Cb. 

3.20 coded B-frame: A B-frame picture or a pair of B-field pictures.

3.21 coded frame: A coded frame is a coded I-frame, a coded P-frame or a coded B-frame.

3.22 coded I-frame: An I-frame picture or a pair of field pictures, where the first field picture is 
an I-picture and the second field picture is an I-picture or a P-picture. 

3.23 coded P-frame: A P-frame picture or a pair of P-field pictures.

3.24 coded picture: A coded picture is made of a picture header, the optional extensions
immediately following it, and the following picture data. A coded picture may be a coded
frame or a coded field.

3.25 coded video bitstream: A coded representation of a series of one or more pictures as defined 
in this specification. 

3.26 coded order: The order in which the pictures are transmitted and decoded. This order is not 
necessarily the same as the display order. 

3.27 coded representation: A data element as represented in its encoded form. 

3.28 coding parameters: The set of user-definable parameters that characterise a coded video 
bitstream. Bitstreams are characterised by coding parameters. Decoders are characterised by 
the bitstreams that they are capable of decoding.

3.29 component: A matrix, block or single sample from one of the three matrices (luminance and 
two chrominance) that make up a picture. 

3.30 compression: Reduction in the number of bits used to represent an item of data.

3.31 constant bitrate coded video: A coded video bitstream with a constant bitrate.

3.32 constant bitrate: Operation where the bitrate is constant from start to finish of the coded
bitstream.

3.33 data element: An item of data as represented before encoding and after decoding.

3.34 data partitioning: A method for dividing a bitstream into two separate bitstreams for error
resilience purposes. The two bitstreams have to be recombined before decoding.

3.35 D-Picture: A type of picture that shall not be used except in ISO/IEC 11172-2.

3.36 DC coefficient: The DCT coefficient for which the frequency is zero in both dimensions.

3.37 DCT coefficient: The amplitude of a specific cosine basis function.

3.38 decoder input buffer: The first-in first-out (FIFO) buffer specified in the video buffering
verifier.

3.39 decoder: An embodiment of a decoding process.

3.40 decoding (process): The process defined in this specification that reads an input coded 
bitstream and produces decoded pictures or audio samples. 

3.41 dequantisation: The process of rescaling the quantised DCT coefficients after their
representation in the bitstream has been decoded and before they are presented to the inverse
DCT.

3.42 digital storage media; DSM: A digital storage or transmission device or system. 

3.43 discrete cosine transform; DCT: Either the forward discrete cosine transform or the inverse
discrete cosine transform. The DCT is an invertible, discrete orthogonal transformation.
The inverse DCT is defined in Annex A of this specification.

3.44 display aspect ratio: The ratio height/width (in SI units) of the intended display.

3.45 display order: The order in which the decoded pictures are displayed. Normally this is the
same order in which they were presented at the input of the encoder.

3.46 display process: The (non-normative) process by which reconstructed frames are displayed.

3.47 dual-prime prediction: A prediction mode in which two forward field-based predictions are
averaged. The predicted block size is 16x16 luminance samples. Dual-prime prediction is 
only used in interlaced P-pictures. 

3.48 editing: The process by which one or more coded bitstreams are manipulated to produce a
new coded bitstream. Conforming edited bitstreams must meet the requirements defined in 
this specification. 

3.49 encoder: An embodiment of an encoding process. 

3.50 encoding (process): A process, not specified in this specification, that reads a stream of 
input pictures or audio samples and produces a valid coded bitstream as defined in this 
specification. 

3.51 enhancement layer: A relative reference to a layer (above the base layer) in a scalable
hierarchy. For all forms of scalability, its decoding process can be described by reference to
the lower layer decoding process and the appropriate additional decoding process for the
enhancement layer itself.

3.52 fast forward playback: The process of displaying a sequence, or parts of a sequence, of
pictures in display-order faster than real-time. 

3.53 fast reverse playback: The process of displaying the picture sequence in the reverse of 
display order faster than real-time. 

3.54 field: For an interlaced video signal, a "field" is the assembly of alternate lines of a frame.
Therefore an interlaced frame is composed of two fields, a top field and a bottom field. 

3.55 field-based prediction: A prediction mode using only one field of the reference frame. The 
predicted block size is 16x16 luminance samples. Field-based prediction is not used in 
progressive frames. 

3.56 field period: The reciprocal of twice the frame rate. 

3.57 field picture; field structure picture: A field structure picture is a coded picture with
picture_structure equal to "Top field" or "Bottom field".

3.58 flag: A one bit integer variable which may take one of only two values (zero and one). 

3.59 forbidden: The term "forbidden" when used in the clauses defining the coded bitstream
indicates that the value shall never be used. This is usually to avoid emulation of start codes.

3.60 forced updating: The process by which macroblocks are intra-coded from time-to-time to
ensure that mismatch errors between the inverse DCT processes in encoders and decoders
cannot build up excessively. 

3.61 forward compatibility: A newer coding standard is forward compatible with an older 
coding standard if decoders designed to operate with the newer coding standard are able to 
decode bitstreams of the older coding standard. 

3.62 forward motion vector: A motion vector that is used for motion compensation from a 
reference frame or reference field at an earlier time in display order.

3.63 forward prediction: Prediction from the past reference frame (field). 



Recommendation ITU-T H.262 (1995 E) | ISO/IEC 13818-2: 1995 (E)

3.64 frame: A frame contains lines of spatial information of a video signal. For progressive video,
these lines contain samples starting from one time instant and continuing through successive
lines to the bottom of the frame. For interlaced video a frame consists of two fields, a top
field and a bottom field. One of these fields will commence one field period later than the
other.

3.65 frame-based prediction: A prediction mode using both fields of the reference frame.

3.66 frame period: The reciprocal of the frame rate.

3.67 frame picture; frame structure picture: A frame structure picture is a coded picture with
picture_structure equal to "Frame".

3.68 frame rate: The rate at which frames are output from the decoding process.

3.69 future reference frame (field): A future reference frame (field) is a reference frame (field)
that occurs at a later time than the current picture in display order.

3.70 frame reordering: The process of reordering the reconstructed frames when the coded
order is different from the display order. Frame reordering occurs when B-frames are
present in a bitstream. There is no frame reordering when decoding low delay bitstreams.

3.71 group of pictures: A notion defined only in ISO/IEC 11172-2 (MPEG-1 Video). In this
specification, a similar functionality can be achieved by means of inserting group of
pictures headers.

3.72 header: A block of data in the coded bitstream containing the coded representation of a
number of data elements pertaining to the coded data that follow the header in the bitstream.

3.73 hybrid scalability: Hybrid scalability is the combination of two (or more) types of
scalability.

3.74 interlace: The property of conventional television frames where alternating lines of the
frame represent different instances in time. In an interlaced frame, one of the fields is meant
to be displayed first. This field is called the first field. The first field can be the top field or
the bottom field of the frame.

3.75 I-field picture: A field structure I-Picture.

3.76 I-frame picture: A frame structure I-Picture.

3.77 I-picture; intra-coded picture: A picture coded using information only from itself.

3.78 intra coding: Coding of a macroblock or picture that uses information only from that
macroblock or picture.

3.79 level: A defined set of constraints on the values which may be taken by the parameters of this
specification within a particular profile. A profile may contain one or more levels. In a
different context, level is the absolute value of a non-zero coefficient (see "run").

3.80 layer: In a scalable hierarchy denotes one out of the ordered set of bitstreams and (the result
of) its associated decoding process (implicitly including decoding of all layers below this
layer).

3.81 layer bitstream: A single bitstream associated to a specific layer (always used in
conjunction with layer qualifiers, e.g. "enhancement layer bitstream").

3.82 lower layer: A relative reference to the layer immediately below a given enhancement layer
(implicitly including decoding of all layers below this enhancement layer).

3.83 luminance component: A matrix, block or single sample representing a monochrome
representation of the signal and related to the primary colours in the manner defined in the
bitstream. The symbol used for luminance is Y.

3.84 Mbit: 1 000 000 bits 

3.85 macroblock: The four 8 by 8 blocks of luminance data and the two (for 4:2:0 chrominance
format), four (for 4:2:2 chrominance format) or eight (for 4:4:4 chrominance format)
corresponding 8 by 8 blocks of chrominance data coming from a 16 by 16 section of the
luminance component of the picture. Macroblock is sometimes used to refer to the sample
data and sometimes to the coded representation of the sample values and other data elements
defined in the macroblock header of the syntax defined in this part of this specification. The
usage is clear from the context.

3.86 motion compensation: The use of motion vectors to improve the efficiency of the prediction
of sample values. The prediction uses motion vectors to provide offsets into the past and/or
future reference frames or reference fields containing previously decoded sample values that
are used to form the prediction error.

3.87 motion estimation: The process of estimating motion vectors during the encoding process. 

3.88 motion vector: A two-dimensional vector used for motion compensation that provides an
offset from the coordinate position in the current picture or field to the coordinates in a
reference frame or reference field.

3.89 non-intra coding: Coding of a macroblock or picture that uses information both from itself
and from macroblocks and pictures occurring at other times.

3.90 opposite parity: The opposite parity of top is bottom, and vice versa. 

3.91 P-field picture: A field structure P-Picture. 

3.92 P-frame picture: A frame structure P-Picture. 

3.93 P-picture; predictive-coded picture: A picture that is coded using motion compensated
prediction from past reference fields or frames.

3.94 parameter: A variable within the syntax of this specification which may take one of a range 
of values. A variable which can take one of only two values is called a flag. 

3.95 parity (of field): The parity of a field can be top or bottom. 

3.96 past reference frame (field): A past reference frame (field) is a reference frame (field) that
occurs at an earlier time than the current picture in display order. 

3.97 picture: Source, coded or reconstructed image data. A source or reconstructed picture 
consists of three rectangular matrices of 8-bit numbers representing the luminance and two 
chrominance signals. A "coded picture" is defined in 3.24. For progressive video, a picture is
identical to a frame, while for interlaced video, a picture can refer to a frame, or the top field 
or the bottom field of the frame depending on the context. 

3.98 picture data: In the VBV operations, picture data is defined as all the bits of the coded
picture, all the header(s) and user data immediately preceding it if any (including any
stuffing between them) and all the stuffing following it, up to (but not including) the next
start code, except in the case where the next start code is an end of sequence code, in which
case it is included in the picture data.

3.99 prediction: The use of a predictor to provide an estimate of the sample value or data element
currently being decoded.

3.100 prediction error: The difference between the actual value of a sample or data element and
its predictor.

3.101 predictor: A linear combination of previously decoded sample values or data elements. 

3.102 profile: A defined subset of the syntax of this specification. 

NOTE - In this specification the word "profile" is used as defined above. It should not be confused
with other definitions of "profile" and in particular it does not have the meaning that is
defined by JTC1/SGFS.

3.103 progressive: The property of film frames where all the samples of the frame represent the
same instances in time.

3.104 quantisation matrix: A set of sixty-four 8-bit values used by the dequantiser. 

3.105 quantised DCT coefficients: DCT coefficients before dequantisation. A variable length 
coded representation of quantised DCT coefficients is transmitted as part of the coded video
bitstream. 

3.106 quantiser scale: A scale factor coded in the bitstream and used by the decoding process to 
scale the dequantisation. 

3.107 random access: The process of beginning to read and decode the coded bitstream at an 
arbitrary point. 

3.108 reconstructed frame: A reconstructed frame consists of three rectangular matrices of 8-bit 
numbers representing the luminance and two chrominance signals. A reconstructed frame is
obtained by decoding a coded frame.

3.109 reconstructed picture: A reconstructed picture is obtained by decoding a coded picture. A 
reconstructed picture is either a reconstructed frame (when decoding a frame picture), or one 
field of a reconstructed frame (when decoding a field picture). If the coded picture is a field 
picture, then the reconstructed picture is the top field or the bottom field of the reconstructed 
frame. 

3.110 reference field: A reference field is one field of a reconstructed frame. Reference fields are
used for forward and backward prediction when P-pictures and B-pictures are decoded. Note
that when field P-pictures are decoded, prediction of the second field P-picture of a coded
frame uses the first reconstructed field of the same coded frame as a reference field.

3.111 reference frame: A reference frame is a reconstructed frame that was coded in the form of
a coded I-frame or a coded P-frame. Reference frames are used for forward and backward
prediction when P-pictures and B-pictures are decoded.

3.112 reordering delay: A delay in the decoding process that is caused by frame reordering.

3.113 reserved: The term "reserved" when used in the clauses defining the coded bitstream
indicates that the value may be used in the future for ISO/IEC defined extensions.

3.114 sample aspect ratio: (abbreviated to SAR). This specifies the relative distance between 
samples. It is defined (for the purposes of this specification) as the vertical displacement of 
the lines of luminance samples in a frame divided by the horizontal displacement of the 
luminance samples. Thus its units are (metres per line) ÷ (metres per sample).

3.115 scalable hierarchy: Coded video data consisting of an ordered set of more than one video
bitstream.

3.116 scalability: Scalability is the ability of a decoder to decode an ordered set of bitstreams to
produce a reconstructed sequence. Moreover, useful video is output when subsets are
decoded. The minimum subset that can thus be decoded is the first bitstream in the set
which is called the base layer. Each of the other bitstreams in the set is called an
enhancement layer. When addressing a specific enhancement layer, "lower layer" refers to
the bitstream which precedes the enhancement layer.

3.117 side information: Information in the bitstream necessary for controlling the decoder. 

3.118 16x8 prediction: A prediction mode similar to field-based prediction but where the predicted 
block size is 16x8 luminance samples. 

3.119 run: The number of zero coefficients preceding a non-zero coefficient, in the scan order.
The absolute value of the non-zero coefficient is called "level".

3.120 saturation: Limiting a value that exceeds a defined range by setting its value to the 
maximum or minimum of the range as appropriate. 

3.121 skipped macroblock: A macroblock for which no data is encoded.

3.122 slice: A consecutive series of macroblocks which are all located in the same horizontal row
of macroblocks.

3.123 SNR scalability: A type of scalability where the enhancement layer(s) contain only coded
refinement data for the DCT coefficients of the lower layer. 

3.124 source; input: Term used to describe the video material or some of its attributes before 
encoding. 

3.125 spatial prediction: Prediction derived from a decoded frame of the lower layer decoder used
in spatial scalability.

3.126 spatial scalability: A type of scalability where an enhancement layer also uses predictions
from sample data derived from a lower layer without using motion vectors. The layers can
have different frame sizes, frame rates or chrominance formats.

3.127 start codes [system and video]: 32-bit codes embedded in that coded bitstream that are 
unique. They are used for several purposes including identifying some of the structures in 
the coding syntax. 

3.128 stuffing (bits); stuffing (bytes): Code-words that may be inserted into the coded bitstream 
that are discarded in the decoding process. Their purpose is to increase the bitrate of the 
stream which would otherwise be lower than the desired bitrate. 

3.129 temporal prediction: Prediction derived from reference frames or fields other than those
defined as spatial prediction.

3.130 temporal scalability: A type of scalability where an enhancement layer also uses
predictions from sample data derived from a lower layer using motion vectors. The layers
have identical frame size, and chrominance formats, but can have different frame rates.

3.131 top field: One of two fields that comprise a frame. Each line of a top field is spatially located
immediately above the corresponding line of the bottom field.

3.132 top layer: The topmost layer (with the highest layer_id) of a scalable hierarchy.

3.133 variable bitrate: Operation where the bitrate varies with time during the decoding of a
coded bitstream.

3.134 variable length coding; VLC: A reversible procedure for coding that assigns shorter code-
words to frequent events and longer code-words to less frequent events.

3.135 video buffering verifier; VBV: A hypothetical decoder that is conceptually connected to the 
output of the encoder. Its purpose is to provide a constraint on the variability of the data rate 
that an encoder or editing process may produce. 

3.136 video sequence: The highest syntactic structure of coded video bitstreams. It contains a 
series of one or more coded frames.

3.137 xxx profile decoder: Decoder able to decode one or a scalable hierarchy of bitstreams of
which the top layer conforms to the specifications of the xxx profile (with xxx being any of
the defined Profile names).

3.138 xxx profile scalable hierarchy: set of bitstreams of which the top layer conforms to the 
specifications of the xxx profile. 

3.139 xxx profile bitstream: a bitstream of a scalable hierarchy with a profile indication 
corresponding to xxx. Note that this bitstream is only decodable together with all its lower 
layer bitstreams (unless it is a base layer bitstream). 

3.140 zigzag scanning order: A specific sequential ordering of the DCT coefficients from
(approximately) the lowest spatial frequency to the highest.
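
As an illustration of definition 3.140, the following minimal sketch generates the conventional 8 by 8 zigzag order by walking the anti-diagonals of a block in alternating directions. It assumes the classic zigzag pattern; the normative scan tables are defined elsewhere in the specification and are not reproduced in this clause. The function name zigzag_order is hypothetical.

```python
def zigzag_order(size=8):
    """Return raster indices (row * size + col) in conventional zigzag order.

    Assumption: classic zigzag walk over anti-diagonals. This is an
    illustration only, not a copy of the normative scan table.
    """
    cells = [(row, col) for row in range(size) for col in range(size)]

    def key(cell):
        row, col = cell
        diagonal = row + col
        # Odd diagonals are walked with the row index increasing,
        # even diagonals with the row index decreasing.
        return (diagonal, row if diagonal % 2 else -row)

    return [row * size + col for row, col in sorted(cells, key=key)]


order = zigzag_order()
print(order[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

The first entry is the DC coefficient (frequency zero in both dimensions) and the last entry is the coefficient with the highest frequency in both dimensions.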

4 Abbreviations and symbols

The mathematical operators used to describe this specification are similar to those used in the C
programming language. However, integer divisions with truncation and rounding are specifically defined.
Numbering and counting loops generally begin from zero.

4.1 Arithmetic operators 

+        Addition.

-        Subtraction (as a binary operator) or negation (as a unary operator).

++       Increment, i.e. x++ is equivalent to x = x + 1.

--       Decrement, i.e. x-- is equivalent to x = x - 1.

×        Multiplication.

^        Power.

/        Integer division with truncation of the result toward zero. For example, 7/4 and -7/-4 are
         truncated to 1 and -7/4 and 7/-4 are truncated to -1.

//       Integer division with rounding to the nearest integer. Half-integer values are rounded away
         from zero unless otherwise specified. For example 3//2 is rounded to 2, and -3//2 is rounded
         to -2.

DIV      Integer division with truncation of the result toward minus infinity. For example 3 DIV 2 is
         rounded to 1, and -3 DIV 2 is rounded to -2.

÷        Used to denote division in mathematical equations where no truncation or rounding is
         intended.

%        Modulus operator. Defined only for positive numbers.

Sign( )  $\mathrm{Sign}(x) = \begin{cases} 1 & x > 0 \\ 0 & x == 0 \\ -1 & x < 0 \end{cases}$

Abs( )   $\mathrm{Abs}(x) = \begin{cases} x & x >= 0 \\ -x & x < 0 \end{cases}$

$\sum_{i=a}^{i<b} f(i)$   The summation of the f(i) with i taking integral values from a up to, but not including b.
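
The three integer divisions differ only in how negative and half-integer quotients are handled, which is easy to get wrong in an implementation. The sketch below expresses them in Python using exact integer arithmetic; the function names (div_trunc, div_round, div_floor, sign, summation) are illustrative and do not appear in the specification.

```python
def div_trunc(a, b):
    """'/' operator: truncate the quotient toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def div_round(a, b):
    """'//' operator: round to nearest, half-integer values away from zero."""
    # floor(|a| / |b| + 1/2) computed in integers, then the sign is restored.
    q = (2 * abs(a) + abs(b)) // (2 * abs(b))
    return q if (a >= 0) == (b >= 0) else -q


def div_floor(a, b):
    """'DIV' operator: truncate toward minus infinity."""
    return a // b  # Python's integer division already floors


def sign(x):
    """Sign(x): 1 for x > 0, 0 for x == 0, -1 for x < 0."""
    return (x > 0) - (x < 0)


def summation(f, a, b):
    """Sum of f(i) for i from a up to, but not including, b."""
    return sum(f(i) for i in range(a, b))


print(div_trunc(-7, 4), div_round(-3, 2), div_floor(-3, 2))  # -1 -2 -2
```

Note that for a negative dividend the "/" and DIV operators give different results: -3/2 truncates to -1, whereas -3 DIV 2 gives -2.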



4.2 Logical operators 

||        Logical OR.

&&        Logical AND.

!         Logical NOT.

4.3 Relational operators

>         Greater than.

>=        Greater than or equal to.

<         Less than.

<=        Less than or equal to.

==        Equal to.

!=        Not equal to.

max [,...,]   The maximum value in the argument list.

min [,...,]   The minimum value in the argument list.

4.4 Bitwise operators 

& AND 

| OR

>> Shift right with sign extension.

<< Shift left with zero fill.

4.5 Assignment 

= Assignment operator. 

4.6 Mnemonics 

The following mnemonics are defined to describe the different data types used in the coded bitstream.

bslbf Bit string, left bit first, where "left" is the order in which bit strings are written in this
specification. Bit strings are generally written as a string of 1s and 0s within single quote
marks, e.g. '1000 0001'. Blanks within a bit string are for ease of reading and have no
significance. For convenience large strings are occasionally written in hexadecimal, in this
case conversion to a binary in the conventional manner will yield the value of the bit string.
Thus the left most hexadecimal digit is first and in each hexadecimal digit the most
significant of the four bits is first.

uimsbf Unsigned integer, most significant bit first.

simsbf Signed integer, in twos complement format, most significant (sign) bit first.

vlclbf Variable length code, left bit first, where "left" refers to the order in which the VLC codes
are written. The byte order of multibyte words is most significant byte first.
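
To make the mnemonics concrete, the following sketch interprets a bit string written in the notation above as an unsigned (uimsbf) or signed (simsbf) integer, and expands hexadecimal digits into a bit string with the leftmost digit first. The Python function names mirror the mnemonics for readability only; they are hypothetical helpers, not functions defined by the specification.

```python
def uimsbf(bits):
    """Interpret a bit string as an unsigned integer, most significant bit first."""
    value = 0
    for bit in bits.replace(" ", ""):  # blanks have no significance
        value = (value << 1) | int(bit)
    return value


def simsbf(bits):
    """Interpret a non-empty bit string as a two's complement signed integer."""
    clean = bits.replace(" ", "")
    value = uimsbf(clean)
    if clean[0] == "1":  # sign bit set: subtract 2^n
        value -= 1 << len(clean)
    return value


def hex_to_bslbf(hex_digits):
    """Expand hexadecimal digits into a bit string, leftmost digit first."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(uimsbf("1000 0001"), simsbf("1000 0001"))  # 129 -127
```

The same eight bits therefore yield 129 when read as uimsbf and -127 when read as simsbf.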

4.7 Constants 

π 3,141 592 653 58...

e 2,718 281 828 45...

5 Conventions

5.1 Method of describing bitstream syntax 

The bitstream retrieved by the decoder is described in 6.2. Each data item in the bitstream is in bold type. 
It is described by its name, its length in bits, and a mnemonic for its type and order of transmission. 

The action caused by a decoded data element in a bitstream depends on the value of that data element and
on data elements previously decoded. The decoding of the data elements and definition of the state
variables used in their decoding are described in 6.3. The following constructs are used to express the
conditions when data elements are present, and are in normal type:



while ( condition ) {
    data_element
    . . .
}

    If the condition is true, then the group of data elements occurs next in the data stream.
    This repeats until the condition is not true.

do {
    data_element
    . . .
} while ( condition )

    The data element always occurs at least once. The data element is repeated until the
    condition is not true.

if ( condition ) {
    data_element
    . . .
} else {
    data_element
    . . .
}

    If the condition is true, then the first group of data elements occurs next in the data
    stream. If the condition is not true, then the second group of data elements occurs next
    in the data stream.

for ( i = m; i < n; i++ ) {
    data_element
    . . .
}

    The group of data elements occurs (n-m) times. Conditional constructs within the group of
    data elements may depend on the value of the loop control variable i, which is set to m for
    the first occurrence, incremented by one for the second occurrence, and so forth.

/* comment ... */

    Explanatory comment that may be deleted entirely without in any way altering the syntax.



This syntax uses the 'C-code' convention that a variable or expression evaluating to a non-zero value is
equivalent to a condition that is true and a variable or expression evaluating to a zero value is equivalent
to a condition that is false. In many cases a literal string is used in a condition. For example:

if ( scalable_mode == "spatial scalability" ) ...

In such cases the literal string is that used to describe the value of the bitstream element in 6.3. In this
example, we see that "spatial scalability" is defined in Table 6-10 to be represented by the two bit binary
number '01'.

As noted, the group of data elements may contain nested conditional constructs. For compactness, the {}
are omitted when only one data element follows.

data_element [n]          data_element [n] is the n+1th element of an array of data.

data_element [m][n]       data_element [m][n] is the m+1, n+1th element of a two-dimensional array of
                          data.

data_element [l][m][n]    data_element [l][m][n] is the l+1, m+1, n+1th element of a three-dimensional
                          array of data.

While the syntax is expressed in procedural terms, it should not be assumed that 6.2 implements a 
satisfactory decoding procedure. In particular, it defines a correct and error-free input bitstream. Actual
decoders must include means to look for start codes in order to begin decoding correctly, and to identify 
errors, erasures or insertions while decoding. The methods to identify these situations, and the actions to 
be taken, are not standardised. 



5.2 Definition of functions 

Several utility functions for the picture coding algorithm are defined as follows:

5.2.1 Definition of bytealigned() function

The function bytealigned() returns 1 if the current position is on a byte boundary, that is the next bit in
the bitstream is the first bit in a byte. Otherwise it returns 0.

5.2.2 Definition of nextbits() function

The function nextbits() permits comparison of a bit string with the next bits to be decoded in the
bitstream.



5.2.3 Definition of next_start_code() function

The next_start_code() function removes any zero bit and zero byte stuffing and locates the next start code.

next_start_code() {                                               No. of bits   Mnemonic
    while ( !bytealigned() )
        zero_bit                                                  1             '0'
    while ( nextbits() != '0000 0000 0000 0000 0000 0001' )
        zero_byte                                                 8             '0000 0000'
}

This function checks whether the current position is byte aligned. If it is not, zero stuffing bits are present.
After that any number of zero stuffing bytes may be present before the start code. Therefore start codes are
always byte aligned and may be preceded by any number of zero stuffing bits.
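
The three utility functions can be sketched as methods of a small bit reader. This is a minimal illustration under stated assumptions, not a conforming decoder: the BitReader class, its read() method and the explicit bit-count argument of nextbits() are hypothetical, and the error handling is a choice made here, since the specification does not standardise how errors are detected or handled.

```python
class BitReader:
    """Minimal MSB-first bit reader (hypothetical helper for illustration)."""

    def __init__(self, data):
        self.data = bytes(data)
        self.pos = 0  # current position, in bits from the start of the stream

    def bytealigned(self):
        """Return 1 if the next bit is the first bit of a byte, otherwise 0."""
        return 1 if self.pos % 8 == 0 else 0

    def nextbits(self, n):
        """Peek at the next n bits as an unsigned integer without consuming them."""
        if self.pos + n > 8 * len(self.data):
            raise EOFError("not enough bits left in the stream")
        value = 0
        for i in range(self.pos, self.pos + n):
            byte = self.data[i // 8]
            value = (value << 1) | ((byte >> (7 - i % 8)) & 1)
        return value

    def read(self, n):
        """Consume and return the next n bits."""
        value = self.nextbits(n)
        self.pos += n
        return value

    def next_start_code(self):
        """Skip zero stuffing bits and bytes up to the 24-bit start code prefix."""
        while not self.bytealigned():
            if self.read(1) != 0:
                raise ValueError("zero_bit must be '0'")
        while self.nextbits(24) != 0x000001:
            if self.read(8) != 0:
                raise ValueError("zero_byte must be '0000 0000'")


# Three payload bits, five stuffing bits, one stuffing byte, then a start code.
reader = BitReader(bytes([0b10100000, 0x00, 0x00, 0x00, 0x01, 0xB3]))
reader.read(3)
reader.next_start_code()
print(reader.pos)  # 16
```

After the call the reader is positioned on the first bit of the start code prefix, so the caller can read the full 32-bit start code next.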



5.3 Reserved, forbidden and marker_bit

The terms "reserved" and "forbidden" are used in the description of some values of several fields in the
coded bitstream.

The term "reserved" indicates that the value may be used in the future for ISO/IEC | ITU-T defined
extensions.

The term "forbidden" indicates a value that shall never be used (usually in order to avoid emulation of
start codes).

The term "marker_bit" indicates a one bit integer in which the value zero is forbidden (and it therefore
shall have the value '1'). These marker bits are introduced at several points in the syntax to avoid start
code emulation.

5.4 Arithmetic precision 

In order to reduce discrepancies between implementations of this specification, the following rules for 
arithmetic operations are specified. 

(a) Where arithmetic precision is not specified, such as in the calculation of the IDCT, the precision
shall be sufficient so that significant errors do not occur in the final integer values.

(b) Where ranges of values are given by a colon, the end points are included if a bracket is present,
and excluded if the 'less than' (<) and 'greater than' (>) characters are used. For example, [a : b>
means from a to b, including a but excluding b.

6 Video bitstream syntax and semantics

6.1 Structure of coded video data 

Coded video data consists of an ordered set of video bitstreams, called layers. If there is only one layer,
the coded video data is called non-scalable video bitstream. If there are two layers or more, the coded
video data is called a scalable hierarchy.

The first layer (of the ordered set) is called base layer, and it can always be decoded independently. See
7.1 to 7.6 and 7.12 of this specification for a description of the decoding process for the base layer, except
in the case of Data partitioning, described in 7.10.

Other layers are called enhancement layers, and can only be decoded together with all the lower layers
(previous layers in the ordered set), starting with the base layer. See 7.7 to 7.11 of this specification for a
description of the decoding process for scalable hierarchy.

See Recommendation ITU-T H.222.0 | ISO/IEC 13818-1 for a description of the way layers may be
multiplexed together.

The base layer of a scalable hierarchy may conform to this specification or to other standards such as
ISO/IEC 11172-2. See details in 7.7 to 7.11. Enhancement layers shall conform to this specification.

In all cases apart from Data partitioning, the base layer does not contain a sequence_scalable_extension().
Enhancement layers always contain sequence_scalable_extension().

In general the video bitstream can be thought of as a syntactic hierarchy in which syntactic structures
contain one or more subordinate structures. For instance the structure "picture_data()" contains one or
more of the syntactic structure "slice()" which in turn contains one or more of the structure
"macroblock()".

This structure is very similar to that used in ISO/IEC 11172-2.

6.1.1 Video sequence

The highest syntactic structure of the coded video bitstream is the video sequence. 

A video sequence commences with a sequence header which may optionally be followed by a group of
pictures header and then by one or more coded frames. The order of the coded frames in the coded
bitstream is the order in which the decoder processes them, but not necessarily in the correct order for
display. The video sequence is terminated by a sequence_end_code. At various points in the video
sequence a particular coded frame may be preceded by either a repeat sequence header or a group of
pictures header or both. (In the case that both a repeat sequence header and a group of pictures header
immediately precede a particular picture, the group of pictures header shall follow the repeat sequence
header.)

6.1.1.1 Progressive and interlaced sequences 

This specification deals with coding of both progressive and interlaced sequences. 

The output of the decoding process, for interlaced sequences, consists of a series of reconstructed fields
that are separated in time by a field period. The two fields of a frame may be coded separately (field-pictures).
Alternatively the two fields may be coded together as a frame (frame-pictures). Both frame
pictures and field pictures may be used in a single video sequence.

In progressive sequences each picture in the sequence shall be a frame picture. The sequence, at the
output of the decoding process, consists of a series of reconstructed frames that are separated in time by a
frame period.

6.1.1.2 Frame 

A frame consists of three rectangular matrices of integers; a luminance matrix (Y), and two chrominance 
matrices (Cb and Cr). 

The relationship between these Y, Cb and Cr components and the primary (analogue) Red, Green and
Blue Signals, the chromaticity of these primaries and the transfer characteristics of
the source frame may be specified in the bitstream (or specified by some other means). This information
does not affect the decoding process.

6.1.1.3 Field

A field consists of every other line of samples in the three rectangular matrices of integers representing a
frame.

A frame is the union of a top field and a bottom field. The top field is the field that contains the top-most 
line of each of the three matrices. The bottom field is the other one. 
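
NOTE - The following Python sketch is informative only. It shows one way to separate a sample matrix of a frame into its top and bottom fields and to interleave them again. The function names split_fields and merge_fields and the list-of-rows representation are assumptions made for this illustration.

```python
def split_fields(matrix):
    """Split one sample matrix (a list of rows) into (top_field, bottom_field)."""
    top = matrix[0::2]     # lines 0, 2, 4, ...: contains the top-most line
    bottom = matrix[1::2]  # lines 1, 3, 5, ...: the other field
    return top, bottom


def merge_fields(top, bottom):
    """Interleave two fields of equal height back into a frame matrix."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame
```

The same split would be applied to each of the Y, Cb and Cr matrices of the frame.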

6.1.1.4 Picture 

A reconstructed picture is obtained by decoding a coded picture, i.e. a picture header, the optional
extensions immediately following it, and the picture data. A coded picture may be a frame picture or a
field picture. A reconstructed picture is either a reconstructed frame (when decoding a frame picture), or
one field of a reconstructed frame (when decoding a field picture).

6.1.1.4.1 Field pictures 

If field pictures are used then they shall occur in pairs (one top field followed by one bottom field, or one 
bottom field followed by one top field) and together constitute a coded frame. The two field pictures that 
comprise a coded frame shall be encoded in the bitstream in the order in which they shall occur at the 
output of the decoding process. 

When the first picture of the coded frame is a P-field picture, then the second picture of the coded frame
shall also be a P-field picture. Similarly when the first picture of the coded frame is a B-field picture the
second picture of the coded frame shall also be a B-field picture.

When the first picture of the coded frame is an I-field picture, then the second picture of the frame shall be
either an I-field picture or a P-field picture. If the second picture is a P-field picture then certain
restrictions apply, see 7.6.3.5.

6.1.1.4.2 Frame pictures 

When coding interlaced sequences using frame pictures, the two fields of the frame shall be interleaved
with one another and then the entire frame is coded as a single frame-picture.

6.1.1.5 Picture types 

There are three types of pictures that use different coding methods.

An Intra-coded (I) picture is coded using information only from itself.

A Predictive-coded (P) picture is a picture which is coded using motion compensated prediction from a
past reference frame or past reference field.

A Bidirectionally predictive-coded (B) picture is a picture which is coded using motion compensated
prediction from a past and/or future reference frame(s).

6.1.1.6 Sequence header

A video sequence header commences with a sequence_header_code and is followed by a series of data
elements. In this specification sequence_header() shall be followed by sequence_extension() which
includes further parameters beyond those used by ISO/IEC 11172-2. When sequence_extension() is
present, the syntax and semantics defined in ISO/IEC 11172-2 does not apply, and the present
specification applies.

In repeat sequence headers all of the data elements with the permitted exception of those defining the
quantisation matrices (load_intra_quantiser_matrix, load_non_intra_quantiser_matrix and optionally
intra_quantiser_matrix and non_intra_quantiser_matrix) shall have the same values as in the first
sequence header. The quantisation matrices may be redefined each time that a sequence header occurs in
the bitstream. (Note that quantisation matrices may also be updated using quant_matrix_extension().)

All of the data elements in the sequence_extension() that follows a repeat sequence_header() shall have
the same values as in the first sequence_extension().

If a sequence_scalable_extension() occurs after the first sequence_header() all subsequent sequence
headers shall be followed by sequence_scalable_extension() in which all data elements are the same as in
the first sequence_scalable_extension(). Conversely if no sequence_scalable_extension() occurs between
the first sequence_header() and the first picture_header() then sequence_scalable_extension() shall not
occur in the bitstream.

If a sequence_display_extension() occurs after the first sequence_header() all subsequent sequence headers
shall be followed by sequence_display_extension() in which all data elements are the same as in the first
sequence_display_extension(). Conversely if no sequence_display_extension() occurs between the first
sequence_header() and the first picture_header() then sequence_display_extension() shall not occur in the
bitstream.

Repeating the sequence header allows the data elements of the initial sequence header to be repeated in
order that random access into the video sequence is possible.

In the coded bitstream, a repeat sequence header may precede either an I-picture or a P-picture but not a
B-picture. In the case that an interlaced frame is coded as two separate field pictures a repeat sequence
header shall not precede the second of these two field pictures.

If a bitstream is edited so that all of the data preceding any of the repeat sequence headers is removed (or 
alternatively random access is made to that sequence header) then the resulting bitstream shall be a legal 
bitstream that complies with this specification. In the case that the first picture of the resulting bitstream 
is a P-picture, it is possible that it will contain non-intra macroblocks. Since the reference picture(s) 
required by the decoding process are not available, the reconstructed picture may not be fully defined. 
The time taken to fully refresh the entire frame depends on the refresh techniques employed.

6.1.1.7 I-pictures and group of pictures header 

I-pictures are intended to assist random access into the sequence. Applications requiring random access,
fast-forward playback, or fast reverse playback may use I-pictures relatively frequently.

I-pictures may also be used at scene cuts or other cases where motion compensation is ineffective. 

Group of picture header is an optional header that can be used immediately before a coded I-frame to
indicate to the decoder if the first consecutive B-pictures immediately following the coded I-frame can be
reconstructed properly in the case of a random access. In effect, if the preceding reference frame is not
available, those B-pictures, if any, cannot be reconstructed properly unless they only use backward
prediction or intra coding. This is more precisely defined in the section describing closed_gop and
broken_link. A group of picture header also contains a time code information that is not used by the
decoding process.

In the coded bitstream, the first coded frame following a group of pictures header shall be a coded I-frame.

6.1.1.8 4:2:0 Format

In this format the Cb and Cr matrices shall be one half the size of the Y-matrix in both horizontal and
vertical dimensions. The Y-matrix shall have an even number of lines and samples.

NOTE - When interlaced frames are coded as field pictures, the picture reconstructed from each of
these field pictures shall have a Y-matrix with half the number of lines as the corresponding
frame. Thus the total number of lines in the Y-matrix of an entire frame shall be divisible by
four.

The luminance and chrominance samples are positioned as shown in Figure 6-1.

In order to further specify the organisation, Figures 6-2 and 6-3 show the vertical and temporal
positioning of the samples in an interlaced frame. Figure 6-4 shows the vertical and temporal
positioning of the samples in a progressive frame.

In each field of an interlaced frame, the chrominance samples do not lie (vertically) mid way between the
luminance samples of the field; this is so that the spatial location of the chrominance samples in the frame
is the same whether the frame is represented as a single frame-picture or two field-pictures.

X Represents luminance samples
O Represents chrominance samples

Figure 6-1 - The position of luminance and chrominance samples. 4:2:0 data.

Figure 6-2 - Vertical and temporal positions of samples in an interlaced frame with top_field_first = 1.



Recommendation ITU-T H.262 (1995 E)

© ISO/IEC

ISO/IEC 13818-2: 1995 (E)



Figure 6-3 - Vertical and temporal positions of samples in an interlaced frame with top_field_first = 0.

Figure 6-4 - Vertical and temporal positions of samples in a progressive frame.

6.1.1.9 4:2:2 Format

In this format the Cb and Cr matrices shall be one half the size of the Y-matrix in the horizontal
dimension and the same size as the Y-matrix in the vertical dimension. The Y-matrix shall have an even
number of samples.

NOTE - When interlaced frames are coded as field pictures, the picture reconstructed from each of
these field pictures shall have a Y-matrix with half the number of lines as the corresponding
frame. Thus the total number of lines in the Y-matrix of an entire frame shall be divisible by
two.

The luminance and chrominance samples are positioned as shown in Figure 6-5.

In order to clarify the organisation, Figure 6-6 shows the (vertical) positioning of the samples when the
frame is separated into two fields.

Figure 6-5 - The position of luminance and chrominance samples. 4:2:2 data.

Figure 6-6 - Vertical positions of samples with 4:2:2 and 4:4:4 data

6.1.1.10 4:4:4 Format

In this format the Cb and Cr matrices shall be the same size as the Y-matrix in the horizontal and the
vertical dimensions.

NOTE - When interlaced frames are coded as field pictures, the picture reconstructed from each of
these field pictures shall have a Y-matrix with half the number of lines as the corresponding
frame. Thus the total number of lines in the Y-matrix of an entire frame shall be divisible by
two.

The luminance and chrominance samples are positioned as shown in Figures 6-6 and 6-7.

Figure 6-7 - The position of luminance and chrominance samples. 4:4:4 data.
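
NOTE - The following Python sketch is informative only. It derives the size of the Cb and Cr matrices from the size of the Y-matrix for the 4:2:0, 4:2:2 and 4:4:4 formats described above. The function name chroma_size and the use of format strings as keys are assumptions made for this illustration.

```python
def chroma_size(luma_width, luma_height, chroma_format):
    """Return (width, height) of each chrominance matrix for a given Y-matrix size."""
    if chroma_format == "4:2:0":
        # Half size in both dimensions; Y must have even lines and samples.
        if luma_width % 2 or luma_height % 2:
            raise ValueError("4:2:0 requires an even number of lines and samples")
        return luma_width // 2, luma_height // 2
    if chroma_format == "4:2:2":
        # Half size horizontally only; Y must have an even number of samples.
        if luma_width % 2:
            raise ValueError("4:2:2 requires an even number of samples")
        return luma_width // 2, luma_height
    if chroma_format == "4:4:4":
        # Same size as the Y-matrix.
        return luma_width, luma_height
    raise ValueError("unknown chrominance format")
```

For a 720 by 576 Y-matrix this gives 360 by 288 for 4:2:0, 360 by 576 for 4:2:2 and 720 by 576 for 4:4:4.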



6.1.1.11 Frame reordering 

When the sequence contains coded B-frames, the number of consecutive coded B-frames is variable and
unbounded. The first coded frame after a sequence header shall not be a B-frame.

A sequence may contain no coded P-frames. A sequence may also contain no coded I-frames in which
case some care is required at the start of the sequence and within the sequence to effect both random
access and error recovery.

The order of the coded frames in the bitstream, also called coded order, is the order in which a decoder
reconstructs them. The order of the reconstructed frames at the output of the decoding process, also called
the display order, is not always the same as the coded order and this section defines the rules of frame
reordering that shall happen within the decoding process.

When the sequence contains no coded B-frames, the coded order is the same as the display order. This is
true in particular always when low_delay is one.

When B-frames are present in the sequence re-ordering is performed according to the following rules:

If the current frame in coded order is a B-frame the output frame is the frame reconstructed from that
B-frame.

If the current frame in coded order is an I-frame or P-frame the output frame is the frame reconstructed
from the previous I-frame or P-frame if one exists. If none exists, at the start of the sequence, no frame is
output.

The frame reconstructed from the final I-frame or P-frame in the sequence is output immediately after the
frame reconstructed when the last coded frame in the sequence was removed from the VBV buffer.

The following is an example of frames taken from the beginning of a video sequence. In this example
there are two coded B-frames between successive coded P-frames and also two coded B-frames between
successive coded I- and P-frames and all pictures are frame-pictures. Frame '1I' is used to form a
prediction for frame '4P'. Frames '4P' and '1I' are both used to form predictions for frames '2B' and
'3B'. Therefore the order of coded frames in the coded sequence shall be '1I', '4P', '2B', '3B'. However,
the decoder shall display them in the order '1I', '2B', '3B', '4P'.

At the encoder input,

 1  2  3  4  5  6  7  8  9 10 11 12 13
 I  B  B  P  B  B  P  B  B  I  B  B  P

At the encoder output, in the coded bitstream, and at the decoder input,

 1  4  2  3  7  5  6 10  8  9 13 11 12
 I  P  B  B  P  B  B  I  B  B  P  B  B

At the decoder output,

 1  2  3  4  5  6  7  8  9 10 11 12 13


6.1.2 Slice

A slice is a series of an arbitrary number of consecutive macroblocks. The first and last macroblocks of a 
slice shall not be skipped macroblocks. Every slice shall contain at least one macroblock. Slices shall not 
overlap. The position of slices may change from picture to picture. 

The first and last macroblock of a slice shall be in the same horizontal row of macroblocks. 

Slices shall occur in the bitstream in the order in which they are encountered, starting at the upper-left of
the picture and proceeding by raster-scan order from left to right and top to bottom (illustrated in the
Figures of this clause as alphabetical order).

6.1.2.1 The general slice structure 

In the most general case it is not necessary for the slices to cover the entire picture. Figure 6-8 shows this 
case. Those areas that are not enclosed in a slice are not encoded and no information is encoded for such 
areas (in the specific picture). 

If the slices do not cover the entire picture then it is a requirement that if the picture is subsequently used 
to form predictions then predictions shall only be made from those regions of the picture that were 
enclosed in slices. It is the responsibility of the encoder to ensure this. 

This specification does not define what action a decoder shall take in the regions between the slices. 

6.1.2.2 Restricted slice structure

In certain defined levels of defined profiles a restricted slice structure illustrated in Figure 6-9 shall be
used. In this case every macroblock in the picture shall be enclosed in a slice.

Figure 6-9 - Restricted slice structure.

Where a defined level of a defined profile requires that the slice structure obeys the restrictions detailed in
this clause, the term "restricted slice structure" may be used.

6.1.3 Macroblock

A macroblock contains a section of the luminance component and the spatially corresponding
chrominance components. The term macroblock can either refer to source and decoded data or to the
corresponding coded data elements. A skipped macroblock is one for which no information is transmitted
(see 7.6.6). There are three chrominance formats for a macroblock, namely, 4:2:0, 4:2:2 and 4:4:4
formats. The orders of blocks in a macroblock shall be different for each different chrominance format and
are illustrated below:

A 4:2:0 Macroblock consists of 6 blocks. This structure holds 4 Y, 1 Cb and 1 Cr Blocks and the block
order is depicted in Figure 6-10.

Figure 6-10 - 4:2:0 Macroblock structure

A 4:2:2 Macroblock consists of 8 blocks. This structure holds 4 Y, 2 Cb and 2 Cr Blocks and the block
order is depicted in Figure 6-11.

Figure 6-11 - 4:2:2 Macroblock structure

A 4:4:4 Macroblock consists of 12 blocks. This structure holds 4 Y, 4 Cb and 4 Cr Blocks and the block
order is depicted in Figure 6-12.

Figure 6-12 - 4:4:4 Macroblock structure



In frame pictures, where both frame and field DCT coding may be used, the internal organisation within
the macroblock is different in each case.

In the case of frame DCT coding, each block shall be composed of lines from the two fields
alternately. This is illustrated in Figure 6-13.

In the case of field DCT coding, each block shall be composed of lines from only one of the two
fields. This is illustrated in Figure 6-14.

In the case of chrominance blocks the structure depends upon the chrominance format that is being used.
In the case of 4:2:2 and 4:4:4 formats (where there are two blocks in the vertical dimension of the
macroblock) the chrominance blocks are treated in exactly the same manner as the luminance blocks.
However, in the 4:2:0 format the chrominance blocks shall always be organised in frame structure for the
purposes of DCT coding. It should however be noted that field based predictions may be made for these
blocks which will, in the general case, require that predictions for 8x4 regions (after half-sample filtering)
must be made.

In field pictures, each picture only contains lines from one of the fields. In this case each block consists of
lines taken from successive lines in the picture as illustrated by Figure 6-13.

Figure 6-13 - Luminance macroblock structure in frame DCT coding

Figure 6-14 - Luminance macroblock structure in field DCT coding
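
NOTE - The following Python sketch is informative only. It shows how the 16 lines of the luminance component of a macroblock in a frame picture could be distributed over four 8x8 blocks for frame DCT coding and for field DCT coding. The function name luminance_blocks and the order of the returned blocks (upper-left, upper-right, lower-left, lower-right, with the upper pair taken from the top field in the field DCT case) are assumptions made for this illustration.

```python
def luminance_blocks(macroblock, field_dct):
    """macroblock: 16 rows of 16 luminance samples. Returns four 8x8 blocks."""
    if field_dct:
        # Field DCT coding: each block holds lines from only one field.
        upper_rows = macroblock[0::2]  # lines of the top field
        lower_rows = macroblock[1::2]  # lines of the bottom field
    else:
        # Frame DCT coding: each block holds lines from both fields alternately.
        upper_rows = macroblock[:8]
        lower_rows = macroblock[8:]
    blocks = []
    for rows in (upper_rows, lower_rows):
        blocks.append([row[:8] for row in rows])  # left 8 samples of each line
        blocks.append([row[8:] for row in rows])  # right 8 samples of each line
    return blocks
```

With a macroblock whose samples equal their line number, the first block contains lines 0 to 7 under frame DCT coding and lines 0, 2, 4, ..., 14 under field DCT coding.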



6.1.4 Block 

The term "block" can refer either to source and reconstructed data or to the DCT coefficients or to the 
corresponding coded data elements. 

When "block" refers to source and reconstructed data it refers to an orthogonal section of a luminance or
chrominance component with the same number of lines and samples. There are 8 lines and 8 samples in
the block.

6.2 Video bitstream syntax

6.2.1 Start codes

Start codes are specific bit patterns that do not otherwise occur in the video stream.

Each start code consists of a start code prefix followed by a start code value. The start code prefix is a
string of twenty three bits with the value zero followed by a single bit with the value one. The start code
prefix is thus the bit string '0000 0000 0000 0000 0000 0001'.

The start code value is an eight bit integer which identifies the type of start code. Most types of start code 
have just one start code value. However slice_start_code is represented by many start code values, in this 
case the start code value is the slice_vertical_position for the slice. 

All start codes shall be byte aligned. This shall be achieved by inserting bits with the value zero before 
the start code prefix such that the first bit of the start code prefix is the first (most significant) bit of a byte. 

Table 6-1 defines the slice start code values for the start codes used in the video bitstream. 



Table 6-1 - Start code values

name                              start code value (hexadecimal)
picture_start_code                00
slice_start_code                  01 through AF
reserved                          B0
reserved                          B1
user_data_start_code              B2
sequence_header_code              B3
sequence_error_code               B4
extension_start_code              B5
reserved                          B6
sequence_end_code                 B7
group_start_code                  B8
system start codes (see note)     B9 through FF

NOTE - system start codes are defined in Part 1 of this specification



The use of the start codes is defined in the following syntax description with the exception of the
sequence_error_code. The sequence_error_code has been allocated for use by a media interface to indicate
where uncorrectable errors have been detected.
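
NOTE - The following Python sketch is informative only. It locates start code prefixes in a byte buffer and names each start code according to Table 6-1. The names START_CODE_NAMES, start_code_name and find_start_codes are assumptions made for this illustration. The sketch relies on the property that start codes are byte aligned and that the prefix does not otherwise occur in a conforming video stream.

```python
START_CODE_NAMES = {
    0x00: "picture_start_code",
    0xB2: "user_data_start_code",
    0xB3: "sequence_header_code",
    0xB4: "sequence_error_code",
    0xB5: "extension_start_code",
    0xB7: "sequence_end_code",
    0xB8: "group_start_code",
}


def start_code_name(value):
    """Map an 8-bit start code value to its name in Table 6-1."""
    if 0x01 <= value <= 0xAF:
        return "slice_start_code"  # the value is the slice_vertical_position
    if value in (0xB0, 0xB1, 0xB6):
        return "reserved"
    if value >= 0xB9:
        return "system start code"
    return START_CODE_NAMES[value]


def find_start_codes(data):
    """Return a list of (byte_offset_of_prefix, name) for each start code in data."""
    prefix = b"\x00\x00\x01"
    results = []
    i = data.find(prefix)
    while i != -1 and i + 3 < len(data):
        results.append((i, start_code_name(data[i + 3])))
        i = data.find(prefix, i + 3)
    return results
```

Zero stuffing bytes in front of a prefix are skipped naturally, because the search matches only the last two zero bytes followed by the byte 01.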






6.2.2 Video Sequence

video_sequence() {                                              No. of bits   Mnemonic
    next_start_code()
    sequence_header()
    if ( nextbits() == extension_start_code ) {
        sequence_extension()
        do {
            extension_and_user_data( 0 )
            do {
                if ( nextbits() == group_start_code ) {
                    group_of_pictures_header()
                    extension_and_user_data( 1 )
                }
                picture_header()
                picture_coding_extension()
                extension_and_user_data( 2 )
                picture_data()
            } while ( ( nextbits() == picture_start_code ) ||
                      ( nextbits() == group_start_code ) )
            if ( nextbits() != sequence_end_code ) {
                sequence_header()
                sequence_extension()
            }
        } while ( nextbits() != sequence_end_code )
    } else {
        /* ISO/IEC 11172-2 */
    }
    sequence_end_code                                           32            bslbf
}

6.2.2.1 Sequence header

sequence_header() {                                             No. of bits   Mnemonic
    sequence_header_code                                        32            bslbf
    horizontal_size_value                                       12            uimsbf
    vertical_size_value                                         12            uimsbf
    aspect_ratio_information                                    4             uimsbf
    frame_rate_code                                             4             uimsbf
    bit_rate_value                                              18            uimsbf
    marker_bit                                                  1             bslbf
    vbv_buffer_size_value                                       10            uimsbf
    constrained_parameters_flag                                 1             bslbf
    load_intra_quantiser_matrix                                 1             uimsbf
    if ( load_intra_quantiser_matrix )
        intra_quantiser_matrix[64]                              8*64          uimsbf
    load_non_intra_quantiser_matrix                             1             uimsbf
    if ( load_non_intra_quantiser_matrix )
        non_intra_quantiser_matrix[64]                          8*64          uimsbf
    next_start_code()
}
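
NOTE - The following Python sketch is informative only. It reads the data elements of sequence_header() in the order and with the bit widths listed in the syntax above. The names read_bits, SEQUENCE_HEADER_FIELDS and parse_sequence_header are assumptions made for this illustration, and the buffer is assumed to start on the first byte of the sequence_header_code. The trailing next_start_code() is not performed.

```python
def read_bits(data, pos, n):
    """Read n bits, most significant bit first, starting at bit position pos.

    Returns (value, new_position).
    """
    value = 0
    for i in range(pos, pos + n):
        value = (value << 1) | ((data[i // 8] >> (7 - i % 8)) & 1)
    return value, pos + n


# Fixed-width data elements, in bitstream order: (name, number of bits).
SEQUENCE_HEADER_FIELDS = [
    ("sequence_header_code", 32),
    ("horizontal_size_value", 12),
    ("vertical_size_value", 12),
    ("aspect_ratio_information", 4),
    ("frame_rate_code", 4),
    ("bit_rate_value", 18),
    ("marker_bit", 1),
    ("vbv_buffer_size_value", 10),
    ("constrained_parameters_flag", 1),
]


def parse_sequence_header(data):
    """Return a dict of the sequence_header() data elements found in data."""
    pos, header = 0, {}
    for name, width in SEQUENCE_HEADER_FIELDS:
        header[name], pos = read_bits(data, pos, width)
    if header["sequence_header_code"] != 0x000001B3:
        raise ValueError("buffer does not start with sequence_header_code")
    if header["marker_bit"] != 1:
        raise ValueError("marker_bit shall have the value 1")
    # Each quantiser matrix is present only when its load flag is set.
    for name in ("intra_quantiser_matrix", "non_intra_quantiser_matrix"):
        load, pos = read_bits(data, pos, 1)
        header["load_" + name] = load
        if load:
            matrix = []
            for _ in range(64):
                coefficient, pos = read_bits(data, pos, 8)
                matrix.append(coefficient)
            header[name] = matrix
    return header
```

Without quantiser matrices the header occupies 95 bits, so the next start code can follow after a single zero stuffing bit.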







6.2.2.2 Extension and user data 



extension_and_user_data( i ) {                                  No. of bits   Mnemonic
    while ( ( nextbits() == extension_start_code ) ||
            ( nextbits() == user_data_start_code ) ) {
        if ( ( i != 1 ) && ( nextbits() == extension_start_code ) )
            extension_data( i )
        if ( nextbits() == user_data_start_code )
            user_data()
    }
}

6.2.2.2.1 Extension data

extension_data( i ) {                                           No. of bits   Mnemonic
    while ( nextbits() == extension_start_code ) {
        extension_start_code                                    32            bslbf
        if ( i == 0 ) { /* follows sequence_extension() */
            if ( nextbits() == "Sequence Display Extension ID" )
                sequence_display_extension()
            else
                sequence_scalable_extension()
        }
        /* NOTE - i never takes the value 1 because extension_data()
           never follows a group_of_pictures_header() */
        if ( i == 2 ) { /* follows picture_coding_extension() */
            if ( nextbits() == "Quant Matrix Extension ID" )
                quant_matrix_extension()
            else if ( nextbits() == "Copyright Extension ID" )
                copyright_extension()
            else if ( nextbits() == "Picture Display Extension ID" )
                picture_display_extension()
            else if ( nextbits() == "Picture Spatial Scalable Extension ID" )
                picture_spatial_scalable_extension()
            else
                picture_temporal_scalable_extension()
        }
    }
}



6.2.2.2.2 User data 



user_data() {                                                   No. of bits   Mnemonic
    user_data_start_code                                        32            bslbf
    while ( nextbits() != '0000 0000 0000 0000 0000 0001' ) {
        user_data                                               8             uimsbf
    }
    next_start_code()
}

6.2.2.3 Sequence extension

sequence_extension() {                                          No. of bits   Mnemonic
    extension_start_code                                        32            bslbf
    extension_start_code_identifier                             4             uimsbf
    profile_and_level_indication                                8             uimsbf
    progressive_sequence                                        1             uimsbf
    chroma_format                                               2             uimsbf
    horizontal_size_extension                                   2             uimsbf
    vertical_size_extension                                     2             uimsbf
    bit_rate_extension                                          12            uimsbf
    marker_bit                                                  1             bslbf
    vbv_buffer_size_extension                                   8             uimsbf
    low_delay                                                   1             uimsbf
    frame_rate_extension_n                                      2             uimsbf
    frame_rate_extension_d                                      5             uimsbf
    next_start_code()
}



6.2.2.4 Sequence display extension 



sequence_display_extension() {                                  No. of bits   Mnemonic
    extension_start_code_identifier                             4             uimsbf
    video_format                                                3             uimsbf
    colour_description                                          1             uimsbf
    if ( colour_description ) {
        colour_primaries                                        8             uimsbf
        transfer_characteristics                                8             uimsbf
        matrix_coefficients                                     8             uimsbf
    }
    display_horizontal_size                                     14            uimsbf
    marker_bit                                                  1             bslbf
    display_vertical_size                                       14            uimsbf
    next_start_code()
}

6.2.2.5 Sequence scalable extension

sequence_scalable_extension() {                                 No. of bits   Mnemonic
    extension_start_code_identifier                             4             uimsbf
    scalable_mode                                               2             uimsbf
    layer_id                                                    4             uimsbf
    if ( scalable_mode == "spatial scalability" ) {
        lower_layer_prediction_horizontal_size                  14            uimsbf
        marker_bit                                              1             bslbf
        lower_layer_prediction_vertical_size                    14            uimsbf
        horizontal_subsampling_factor_m                         5             uimsbf
        horizontal_subsampling_factor_n                         5             uimsbf
        vertical_subsampling_factor_m                           5             uimsbf
        vertical_subsampling_factor_n                           5             uimsbf
    }
    if ( scalable_mode == "temporal scalability" ) {
        picture_mux_enable                                      1             uimsbf
        if ( picture_mux_enable )
            mux_to_progressive_sequence                         1             uimsbf
        picture_mux_order                                       3             uimsbf
        picture_mux_factor                                      3             uimsbf
    }
    next_start_code()
}



6.2.2.6 Group of pictures header

group_of_pictures_header() {                                    No. of bits   Mnemonic
    group_start_code                                            32            bslbf
    time_code                                                   25            bslbf
    closed_gop                                                  1             uimsbf
    broken_link                                                 1             uimsbf
    next_start_code()
}
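
NOTE - The following Python sketch is informative only. It extracts the three data elements that follow group_start_code in group_of_pictures_header(), using the bit widths listed above. The function name parse_gop_header is an assumption made for this illustration, the buffer is assumed to start on the first byte of the group_start_code, and time_code is returned as an undivided 25-bit value.

```python
def parse_gop_header(data):
    """Return time_code, closed_gop and broken_link from a group of pictures header."""
    if len(data) < 8:
        raise ValueError("group of pictures header needs at least 8 bytes")
    if data[:4] != b"\x00\x00\x01\xb8":
        raise ValueError("buffer does not start with group_start_code")
    # The 27 bits after the start code are held in the top of this 32-bit word;
    # the remaining 5 bits are zero stuffing before the next start code.
    word = int.from_bytes(data[4:8], "big")
    return {
        "time_code": word >> 7,          # upper 25 bits
        "closed_gop": (word >> 6) & 1,   # next bit
        "broken_link": (word >> 5) & 1,  # next bit
    }
```

Since the header carries 59 bits in total, next_start_code() consumes five zero stuffing bits to reach the following byte boundary.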

6.2.3 Picture header

picture_header() {                                              No. of bits   Mnemonic
    picture_start_code                                          32            bslbf
    temporal_reference                                          10            uimsbf
    picture_coding_type                                         3             uimsbf
    vbv_delay                                                   16            uimsbf
    if ( picture_coding_type == 2 || picture_coding_type == 3 ) {
        full_pel_forward_vector                                 1             bslbf
        forward_f_code                                          3             bslbf
    }
    if ( picture_coding_type == 3 ) {
        full_pel_backward_vector                                1             bslbf
        backward_f_code                                         3             bslbf
    }
    while ( nextbits() == '1' ) {
        extra_bit_picture /* with the value '1' */              1             uimsbf
        extra_information_picture                               8             uimsbf
    }
    extra_bit_picture /* with the value '0' */                  1             uimsbf
    next_start_code()
}

6.2.3.1 Picture coding extension

picture_coding_extension() {                                    No. of bits   Mnemonic
    extension_start_code                                        32            bslbf
    extension_start_code_identifier                             4             uimsbf
    f_code[0][0] /* forward horizontal */                       4             uimsbf
    f_code[0][1] /* forward vertical */                         4             uimsbf
    f_code[1][0] /* backward horizontal */                      4             uimsbf
    f_code[1][1] /* backward vertical */                        4             uimsbf
    intra_dc_precision                                          2             uimsbf
    picture_structure                                           2             uimsbf
    top_field_first                                             1             uimsbf
    frame_pred_frame_dct                                        1             uimsbf
    concealment_motion_vectors                                  1             uimsbf
    q_scale_type                                                1             uimsbf
    intra_vlc_format                                            1             uimsbf
    alternate_scan                                              1             uimsbf
    repeat_first_field                                          1             uimsbf
    chroma_420_type                                             1             uimsbf
    progressive_frame                                           1             uimsbf
    composite_display_flag                                      1             uimsbf
    if ( composite_display_flag ) {
        v_axis                                                  1             uimsbf
        field_sequence                                          3             uimsbf
        sub_carrier                                             1             uimsbf
        burst_amplitude                                         7             uimsbf
        sub_carrier_phase                                       8             uimsbf
    }
    next_start_code()
}

6.2.3.2 Quant matrix extension

quant_matrix_extension() {                                      No. of bits   Mnemonic
    extension_start_code_identifier                             4             uimsbf
    load_intra_quantiser_matrix                                 1             uimsbf
    if ( load_intra_quantiser_matrix )
        intra_quantiser_matrix[64]                              8*64          uimsbf
    load_non_intra_quantiser_matrix                             1             uimsbf
    if ( load_non_intra_quantiser_matrix )
        non_intra_quantiser_matrix[64]                          8*64          uimsbf
    load_chroma_intra_quantiser_matrix                          1             uimsbf
    if ( load_chroma_intra_quantiser_matrix )
        chroma_intra_quantiser_matrix[64]                       8*64          uimsbf
    load_chroma_non_intra_quantiser_matrix                      1             uimsbf
    if ( load_chroma_non_intra_quantiser_matrix )
        chroma_non_intra_quantiser_matrix[64]                   8*64          uimsbf
    next_start_code()
}







6.2.3.3 Picture display extension

picture_display_extension() {                                   No. of bits   Mnemonic
    extension_start_code_identifier                             4             uimsbf
    for ( i=0; i<number_of_frame_centre_offsets; i++ ) {
        frame_centre_horizontal_offset                          16            simsbf
        marker_bit                                              1             bslbf
        frame_centre_vertical_offset                            16            simsbf
        marker_bit                                              1             bslbf
    }
    next_start_code()
}
6.2.3.4 Picture temporal scalable extension

picture_temporal_scalable_extension() {             No. of bits   Mnemonic
    extension_start_code_identifier                 4             uimsbf
    reference_select_code                           2             uimsbf
    forward_temporal_reference                      10            uimsbf
    marker_bit                                      1             bslbf
    backward_temporal_reference                     10            uimsbf
    next_start_code()
}







6.2.3.5 Picture spatial scalable extension

picture_spatial_scalable_extension() {              No. of bits   Mnemonic
    extension_start_code_identifier                 4             uimsbf
    lower_layer_temporal_reference                  10            uimsbf
    marker_bit                                      1             bslbf
    lower_layer_horizontal_offset                   15            simsbf
    marker_bit                                      1             bslbf
    lower_layer_vertical_offset                     15            simsbf
    spatial_temporal_weight_code_table_index        2             uimsbf
    lower_layer_progressive_frame                   1             uimsbf
    lower_layer_deinterlaced_field_select           1             uimsbf
    next_start_code()
}

6.2.3.6 Copyright extension

copyright_extension() {                             No. of bits   Mnemonic
    extension_start_code_identifier                 4             uimsbf
    copyright_flag                                  1             bslbf
    copyright_identifier                            8             uimsbf
    original_or_copy                                1             bslbf
    reserved                                        7             uimsbf
    marker_bit                                      1             bslbf
    copyright_number_1                              20            uimsbf
    marker_bit                                      1             bslbf
    copyright_number_2                              22            uimsbf
    marker_bit                                      1             bslbf
    copyright_number_3                              22            uimsbf
    next_start_code()
}







6.2.3.7 Picture data

picture_data() {                                    No. of bits   Mnemonic
    do {
        slice()
    } while ( nextbits() == slice_start_code )
    next_start_code()
}

6.2.4 Slice

slice() {                                           No. of bits   Mnemonic
    slice_start_code                                32            bslbf
    if ( vertical_size > 2800 )
        slice_vertical_position_extension           3             uimsbf
    if ( <sequence_scalable_extension() is present in the bitstream> ) {
        if ( scalable_mode == "data partitioning" )
            priority_breakpoint                     7             uimsbf
    }
    quantiser_scale_code                            5             uimsbf
    if ( nextbits() == '1' ) {
        intra_slice_flag                            1             bslbf
        intra_slice                                 1             uimsbf
        reserved_bits                               7             uimsbf
        while ( nextbits() == '1' ) {
            extra_bit_slice /* with the value '1' */    1         uimsbf
            extra_information_slice                 8             uimsbf
        }
    }
    extra_bit_slice /* with the value '0' */        1             uimsbf
    do {
        macroblock()
    } while ( nextbits() != '000 0000 0000 0000 0000 0000' )
    next_start_code()
}

6.2.5 Macroblock

macroblock() {                                      No. of bits   Mnemonic
    while ( nextbits() == '0000 0001 000' )
        macroblock_escape                           11            bslbf
    macroblock_address_increment                    1-11          vlclbf
    macroblock_modes()
    if ( macroblock_quant )
        quantiser_scale_code                        5             uimsbf
    if ( macroblock_motion_forward ||
         ( macroblock_intra && concealment_motion_vectors ) )
        motion_vectors( 0 )
    if ( macroblock_motion_backward )
        motion_vectors( 1 )
    if ( macroblock_intra && concealment_motion_vectors )
        marker_bit                                  1             bslbf
    if ( macroblock_pattern )
        coded_block_pattern()
    for ( i=0; i<block_count; i++ ) {
        block( i )
    }
}

6.2.5.1 Macroblock modes

macroblock_modes() {                                No. of bits   Mnemonic
    macroblock_type                                 1-9           vlclbf
    if ( ( spatial_temporal_weight_code_flag == 1 ) &&
         ( spatial_temporal_weight_code_table_index != '00' ) ) {
        spatial_temporal_weight_code                2             uimsbf
    }
    if ( macroblock_motion_forward ||
         macroblock_motion_backward ) {
        if ( picture_structure == 'frame' ) {
            if ( frame_pred_frame_dct == 0 )
                frame_motion_type                   2             uimsbf
        } else {
            field_motion_type                       2             uimsbf
        }
    }
    if ( ( picture_structure == "Frame picture" ) &&
         ( frame_pred_frame_dct == 0 ) &&
         ( macroblock_intra || macroblock_pattern ) ) {
        dct_type                                    1             uimsbf
    }
}
6.2.5.2 Motion vectors

motion_vectors ( s ) {                              No. of bits   Mnemonic
    if ( motion_vector_count == 1 ) {
        if ( ( mv_format == field ) && ( dmv != 1 ) )
            motion_vertical_field_select[0][s]      1             uimsbf
        motion_vector( 0, s )
    } else {
        motion_vertical_field_select[0][s]          1             uimsbf
        motion_vector( 0, s )
        motion_vertical_field_select[1][s]          1             uimsbf
        motion_vector( 1, s )
    }
}

6.2.5.2.1 Motion vector

motion_vector ( r, s ) {                            No. of bits   Mnemonic
    motion_code[r][s][0]                            1-11          vlclbf
    if ( ( f_code[s][0] != 1 ) && ( motion_code[r][s][0] != 0 ) )
        motion_residual[r][s][0]                    1-8           uimsbf
    if ( dmv == 1 )
        dmvector[0]                                 1-2           vlclbf
    motion_code[r][s][1]                            1-11          vlclbf
    if ( ( f_code[s][1] != 1 ) && ( motion_code[r][s][1] != 0 ) )
        motion_residual[r][s][1]                    1-8           uimsbf
    if ( dmv == 1 )
        dmvector[1]                                 1-2           vlclbf
}



6.2.5.3 Coded block pattern

coded_block_pattern() {                             No. of bits   Mnemonic
    coded_block_pattern_420                         3-9           vlclbf
    if ( chroma_format == 4:2:2 )
        coded_block_pattern_1                       2             uimsbf
    if ( chroma_format == 4:4:4 )
        coded_block_pattern_2                       6             uimsbf
}

6.2.6 Block

The detailed syntax for the terms "First DCT coefficient", "Subsequent DCT coefficient" and "End of
Block" is fully described in 7.2.

This clause does not adequately document the block layer syntax when data partitioning is used. See 7.10.

block( i ) {                                        No. of bits   Mnemonic
    if ( pattern_code[i] ) {
        if ( macroblock_intra ) {
            if ( i<4 ) {
                dct_dc_size_luminance               2-9           vlclbf
                if ( dct_dc_size_luminance != 0 )
                    dct_dc_differential             1-11          uimsbf
            } else {
                dct_dc_size_chrominance             2-10          vlclbf
                if ( dct_dc_size_chrominance != 0 )
                    dct_dc_differential             1-11          uimsbf
            }
        } else {
            First DCT coefficient                   2-24          vlclbf
        }
        while ( nextbits() != End of block )
            Subsequent DCT coefficients             3-24          vlclbf
        End of block                                2 or 4        vlclbf
    }
}

6.3 Video bitstream semantics

6.3.1 Semantic rules for higher syntactic structures

This clause details the rules that govern the way in which the higher level syntactic elements may be
combined together to produce a legal bitstream. Subsequent clauses detail the semantic meaning of all
fields in the video bitstream.

Figure 6-15 illustrates the high level structure of the video bitstream.
The following semantic rules apply:

• If the first sequence_header() of the sequence is not followed by sequence_extension() then the
stream shall conform to ISO/IEC 11172-2 and is not documented within this specification.

• If the first sequence_header() of a sequence is followed by a sequence_extension() then all
subsequent occurrences of sequence_header() shall also be immediately followed by a
sequence_extension().

• sequence_extension() shall only occur immediately following a sequence_header().

• Following a sequence_header() there shall be at least one coded picture before a repeat
sequence_header() or a sequence_end_code. This implies that sequence_extension() shall not
immediately precede a sequence_end_code.

• If sequence_extension() occurs in the bitstream then each picture_header() shall be followed
immediately by a picture_coding_extension().

• sequence_end_code shall be positioned at the end of the bitstream such that, after decoding and
frame reordering, there shall be no missing frames.

• picture_coding_extension() shall only occur immediately following a picture_header().

• The first coded frame following a group_of_pictures_header() shall be a coded I-frame.

A number of different extensions are defined in addition to sequence_extension() and
picture_coding_extension(). The set of allowed extensions is different at each different point in the syntax
where extensions are allowed. Table 6-2 defines a four bit extension_start_code_identifier for each
extension.

[Figure 6-15 is not reproduced here; one label in the figure reads "ISO/IEC 11172-2".]

* After a GOP the first picture shall be an I-picture

Figure 6-15. High level bitstream organisation

At each point where extensions are allowed in the bitstream any number of the extensions from the
defined allowable set may be included. However each type of extension shall not occur more than once.

In the case that a decoder encounters an extension with an extension identification that is described as
"reserved" in this specification the decoder shall discard all subsequent data until the next start code.
This requirement allows future definition of compatible extensions to this specification.



Table 6-2. extension_start_code_identifier codes.

extension_start_code_identifier     Name
0000                                reserved
0001                                Sequence Extension ID
0010                                Sequence Display Extension ID
0011                                Quant Matrix Extension ID
0100                                Copyright Extension ID
0101                                Sequence Scalable Extension ID
0110                                reserved
0111                                Picture Display Extension ID
1000                                Picture Coding Extension ID
1001                                Picture Spatial Scalable Extension ID
1010                                Picture Temporal Scalable Extension ID
1011                                reserved
1100                                reserved
...                                 ...
1111                                reserved

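The handling of extension identifiers, including the rule that data following a "reserved" identifier is discarded up to the next start code, can be sketched informally as below. This is an illustration only: the function names are hypothetical, and the three-byte start code prefix 00 00 01 is assumed from the hexadecimal start code values quoted in this clause (for example '000001B5').

```python
# Identifier values and names as listed in Table 6-2; anything else is reserved.
EXTENSION_NAMES = {
    0b0001: "Sequence Extension ID",
    0b0010: "Sequence Display Extension ID",
    0b0011: "Quant Matrix Extension ID",
    0b0100: "Copyright Extension ID",
    0b0101: "Sequence Scalable Extension ID",
    0b0111: "Picture Display Extension ID",
    0b1000: "Picture Coding Extension ID",
    0b1001: "Picture Spatial Scalable Extension ID",
    0b1010: "Picture Temporal Scalable Extension ID",
}

# Assumption: every start code begins with the byte sequence 00 00 01.
START_CODE_PREFIX = b"\x00\x00\x01"


def classify_extension(identifier: int) -> str:
    """Map a 4-bit extension_start_code_identifier to its Table 6-2 name."""
    if not 0 <= identifier <= 0b1111:
        raise ValueError("extension_start_code_identifier is a 4-bit value")
    return EXTENSION_NAMES.get(identifier, "reserved")


def skip_to_next_start_code(data: bytes, offset: int) -> int:
    """Return the index of the next start code prefix at or after offset.

    Returns len(data) when no further start code is found.
    """
    index = data.find(START_CODE_PREFIX, offset)
    return len(data) if index < 0 else index
```

A decoder built this way would call skip_to_next_start_code() whenever classify_extension() returns "reserved", which keeps it tolerant of extensions defined later.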
6.3.2 Video sequence

sequence_end_code -- The sequence_end_code is the bit string '000001B7' in hexadecimal. It terminates
a video sequence.

6.3.3 Sequence header

sequence_header_code -- The sequence_header_code is the bit string '000001B3' in hexadecimal. It
identifies the beginning of a sequence header.

horizontal_size_value -- This word forms the 12 least significant bits of horizontal_size.

vertical_size_value -- This word forms the 12 least significant bits of vertical_size.

horizontal_size -- The horizontal_size is a 14-bit unsigned integer, the 12 least significant bits are
defined in horizontal_size_value, the 2 most significant bits are defined in horizontal_size_extension. The
horizontal_size is the width of the displayable part of the luminance component of pictures in samples.
The width of the encoded luminance component of pictures in macroblocks, mb_width, is
(horizontal_size + 15)/16. The displayable part is left-aligned in the encoded pictures.

In order to avoid start code emulation horizontal_size_value shall not be zero. This precludes values of
horizontal_size that are multiples of 4096.

vertical_size -- The vertical_size is a 14-bit unsigned integer, the 12 least significant bits are defined in
vertical_size_value, the 2 most significant bits are defined in vertical_size_extension. The vertical_size is
the height of the displayable part of the luminance component of the frame in lines.

In the case that progressive_sequence is '1' the height of the encoded luminance component of frames in
macroblocks, mb_height, is (vertical_size + 15)/16.

In the case that progressive_sequence is '0' the height of the encoded luminance component of frame
pictures in macroblocks, mb_height, is 2*((vertical_size + 31)/32). The height of the encoded luminance
component of field pictures in macroblocks, mb_height, is ((vertical_size + 31)/32).

The displayable part is top-aligned in the encoded pictures.

In order to avoid start code emulation vertical_size_value shall not be zero. This precludes values of
vertical_size that are multiples of 4096.

aspect_ratio_information -- This is a four-bit integer defined in Table 6-3.

aspect_ratio_information either specifies that the "sample aspect ratio" (SAR) of the reconstructed frame
is 1,0 (square samples) or alternatively it gives the "display aspect ratio" (DAR).

• If sequence_display_extension() is not present then it is intended that the entire reconstructed
frame is mapped to the entire active region of the display. The sample aspect ratio
may be calculated as follows:

    SAR = DAR × (horizontal_size ÷ vertical_size)

NOTE - In this case horizontal_size and vertical_size are constrained by the SAR of the source and
the DAR selected.

• If sequence_display_extension() is present then the sample aspect ratio may be calculated as
follows:

    SAR = DAR × (display_horizontal_size ÷ display_vertical_size)

Table 6-3. aspect_ratio_information

aspect_ratio_information     Sample Aspect Ratio       DAR
0000                         forbidden                 forbidden
0001                         1,0 (Square Sample)       -
0010                         -                         3 ÷ 4
0011                         -                         9 ÷ 16
0100                         -                         1 ÷ 2,21
0101                         -                         reserved
...                          ...                       ...
1111                         -                         reserved


frame_rate_code -- This is a four-bit integer used to define frame_rate_value as shown in Table 6-4.
frame_rate may be derived from frame_rate_value, frame_rate_extension_n and frame_rate_extension_d
as follows:

    frame_rate = frame_rate_value * (frame_rate_extension_n + 1) ÷ (frame_rate_extension_d + 1)

When an entry for the frame rate exists directly in Table 6-4, frame_rate_extension_n and
frame_rate_extension_d shall be zero. (frame_rate_extension_n + 1) and (frame_rate_extension_d + 1)
shall not have a common divisor greater than one.

If progressive_sequence is '1' the period between two successive frames at the output of the decoding
process is the reciprocal of the frame_rate. See Figure 7-18.

If progressive_sequence is '0' the period between two successive fields at the output of the decoding
process is half of the reciprocal of the frame_rate. See Figure 7-20.

The frame_rate signalled in the enhancement layer of temporal scalability is the combined frame rate
after the temporal remultiplex operation if picture_mux_enable in the sequence_scalable_extension() is set
to '1'.

Table 6-4. frame_rate_value

frame_rate_code     frame_rate_value
0000                forbidden
0001                24 000 ÷ 1001 (23,976...)
0010                24
0011                25
0100                30 000 ÷ 1001 (29,97...)
0101                30
0110                50
0111                60 000 ÷ 1001 (59,94...)
1000                60
...                 reserved
1111                reserved
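As a worked illustration of the frame rate derivation, the sketch below evaluates the formula with exact rational arithmetic so that rates such as 30 000 ÷ 1001 are not rounded. The function names are hypothetical and the code is not part of this specification.

```python
from fractions import Fraction

# frame_rate_value for each permitted frame_rate_code (Table 6-4).
FRAME_RATE_VALUES = {
    0b0001: Fraction(24000, 1001),
    0b0010: Fraction(24),
    0b0011: Fraction(25),
    0b0100: Fraction(30000, 1001),
    0b0101: Fraction(30),
    0b0110: Fraction(50),
    0b0111: Fraction(60000, 1001),
    0b1000: Fraction(60),
}


def frame_rate(frame_rate_code: int, extension_n: int = 0,
               extension_d: int = 0) -> Fraction:
    """frame_rate = frame_rate_value * (n + 1) / (d + 1)."""
    if frame_rate_code not in FRAME_RATE_VALUES:
        raise ValueError("forbidden or reserved frame_rate_code")
    return FRAME_RATE_VALUES[frame_rate_code] * Fraction(extension_n + 1,
                                                          extension_d + 1)


def output_period(frame_rate_code: int, progressive_sequence: bool,
                  extension_n: int = 0, extension_d: int = 0) -> Fraction:
    """Seconds between successive frames (progressive) or fields (interlaced)."""
    rate = frame_rate(frame_rate_code, extension_n, extension_d)
    return 1 / rate if progressive_sequence else 1 / (2 * rate)
```

For example, frame_rate_code '0011' with frame_rate_extension_n = 2 and frame_rate_extension_d = 4 yields 25 * 3 ÷ 5 = 15 frames per second.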



bit_rate_value -- The lower 18 bits of bit_rate.

bit_rate -- This is a 30-bit integer. The lower 18 bits of the integer are in bit_rate_value and the upper 12
bits are in bit_rate_extension. bit_rate is measured in units of 400 bits/second, rounded upwards. The
value zero is forbidden.

The bitrate specified bounds the maximum rate of operation of the VBV as defined in C.3 of annex C.

The VBV operates in one of two modes depending on the coded values in vbv_delay. In all cases (both
constant and variable bitrate operation) the bitrate specified shall be the upper bound of the rate at which
the coded data is supplied to the input of the VBV.

NOTE - Since constant bitrate operation is simply a special case of variable bitrate operation there is
no requirement that the value of bit_rate is the actual bitrate at which the data is supplied.
However it is recommended in the case of constant bitrate operation that bit_rate should
represent the actual bitrate.

marker_bit -- This is one bit that shall be set to '1'. This bit prevents emulation of start codes.

vbv_buffer_size_value -- The lower 10 bits of vbv_buffer_size.

vbv_buffer_size -- vbv_buffer_size is an 18-bit integer. The lower 10 bits of the integer are in
vbv_buffer_size_value and the upper 8 bits are in vbv_buffer_size_extension. The integer defines the
size of the VBV (Video Buffering Verifier, see Annex C) buffer needed to decode the sequence. It is
defined as:

    B = 16 * 1024 * vbv_buffer_size

where B is the minimum VBV buffer size in bits required to decode the sequence (see Annex C).
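The way bit_rate and vbv_buffer_size are assembled from their value and extension parts can be sketched as follows. This is an informal example with hypothetical function names; it simply combines the two parts of each integer and applies the units stated above.

```python
def bit_rate_bits_per_second(bit_rate_value: int,
                             bit_rate_extension: int = 0) -> int:
    """Combine the 18 LSBs and 12 MSBs of bit_rate and convert to bits/second."""
    bit_rate = (bit_rate_extension << 18) | bit_rate_value
    if bit_rate == 0:
        raise ValueError("bit_rate value zero is forbidden")
    return bit_rate * 400  # bit_rate is expressed in units of 400 bits/second


def vbv_buffer_bits(vbv_buffer_size_value: int,
                    vbv_buffer_size_extension: int = 0) -> int:
    """Return B = 16 * 1024 * vbv_buffer_size, the VBV buffer size in bits."""
    vbv_buffer_size = (vbv_buffer_size_extension << 10) | vbv_buffer_size_value
    return 16 * 1024 * vbv_buffer_size
```

For instance, a bit_rate_value of 37 500 with a zero extension corresponds to 15 000 000 bits/second, and a vbv_buffer_size of 112 corresponds to B = 1 835 008 bits.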

constrained_parameters_flag -- This flag (used in ISO/IEC 11172-2) has no meaning in this
specification and shall have the value '0'.

load_intra_quantiser_matrix -- See 6.3.11 "Quant matrix extension"
intra_quantiser_matrix -- See 6.3.11 "Quant matrix extension"
load_non_intra_quantiser_matrix -- See 6.3.11 "Quant matrix extension"
non_intra_quantiser_matrix -- See 6.3.11 "Quant matrix extension"

6.3.4 Extension and user data

extension_start_code -- The extension_start_code is the bit string '000001B5' in hexadecimal. It
identifies the beginning of extensions beyond ISO/IEC 11172-2.

6.3.4.1 User data

user_data_start_code -- The user_data_start_code is the bit string '000001B2' in hexadecimal. It
identifies the beginning of user data. The user data continues until receipt of another start code.

user_data -- This is an 8 bit integer, an arbitrary number of which may follow one another. User data is
defined by users for their specific applications. In the series of consecutive user_data bytes there shall not
be a string of 23 or more consecutive zero bits.
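The constraint on user_data (no run of 23 or more consecutive zero bits) can be verified before the bytes are written to a bitstream. The check below is an illustrative sketch with hypothetical function names; it scans the bytes most significant bit first.

```python
def longest_zero_bit_run(data: bytes) -> int:
    """Return the length of the longest run of consecutive zero bits."""
    longest = 0
    current = 0
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            if (byte >> shift) & 1:
                current = 0
            else:
                current += 1
                longest = max(longest, current)
    return longest


def is_valid_user_data(data: bytes) -> bool:
    """True when the bytes contain no string of 23 or more zero bits."""
    return longest_zero_bit_run(data) < 23
```

The limit matters because 23 zero bits followed by a one bit would look like the start of a start code inside the user data.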

6.3.5 Sequence extension

extension_start_code_identifier -- This is a 4-bit integer which identifies the extension. See Table 6-2.

profile_and_level_indication -- This is an 8-bit integer used to signal the profile and level identification.
The meaning of the bits is given in clause 8.

NOTE - In a scalable hierarchy the bitstreams of each layer may set profile_and_level_indication to a
different value as specified in clause 8.

progressive_sequence -- When set to '1' the coded video sequence contains only progressive
frame-pictures. When progressive_sequence is set to '0' the coded video sequence may contain both
frame-pictures and field-pictures, and frame-pictures may be progressive or interlaced frames.

chroma_format -- This is a two bit integer indicating the chrominance format as defined in Table 6-5.



Table 6-5. Meaning of chroma_format

chroma_format     Meaning
00                reserved
01                4:2:0
10                4:2:2
11                4:4:4



horizontal_size_extension -- This 2 bit integer is the 2 most significant bits from horizontal_size.

vertical_size_extension -- This 2 bit integer is the 2 most significant bits from vertical_size.

bit_rate_extension -- This 12 bit integer is the 12 most significant bits from bit_rate.

vbv_buffer_size_extension -- This 8 bit integer is the 8 most significant bits from vbv_buffer_size.

low_delay -- This flag, when set to '1', indicates that the sequence does not contain any B-pictures, that
the frame reordering delay is not present in the VBV description and that the bitstream may contain "big
pictures", i.e. that C.7 of the VBV may apply.

When set to '0', it indicates that the sequence may contain B-pictures, that the frame reordering delay is
present in the VBV description and that the bitstream shall not contain big pictures, i.e. C.7 of the VBV does
not apply.

This flag is not used during the decoding process and therefore can be ignored by decoders, but it is
necessary to define and verify the compliance of low-delay bitstreams.

frame_rate_extension_n -- This is a 2 bit integer used to determine the frame_rate. See frame_rate_code.
frame_rate_extension_d -- This is a 5 bit integer used to determine the frame_rate. See frame_rate_code.

6.3.6 Sequence display extension

This specification does not define the display process. The information in this extension does not affect
the decoding process and may be ignored by decoders that conform to this specification.

video_format -- This is a three bit integer indicating the representation of the pictures before being coded
in accordance with this specification. Its meaning is defined in Table 6-6. If the
sequence_display_extension() is not present in the bitstream then the video format may be assumed to be
"Unspecified video format".



Table 6-6. Meaning of video_format

video_format     Meaning
000              component
001              PAL
010              NTSC
011              SECAM
100              MAC
101              Unspecified video format
110              reserved
111              reserved

colour_description -- A flag which if set to '1' indicates the presence of colour_primaries,
transfer_characteristics and matrix_coefficients in the bitstream.

colour_primaries -- This 8-bit integer describes the chromaticity coordinates of the source primaries, and
is defined in Table 6-7.

Table 6-7. Colour Primaries

Value     Primaries
0         (forbidden)
1         Recommendation ITU-R BT.709
              primary       x          y
              green         0,300      0,600
              blue          0,150      0,060
              red           0,640      0,330
              white D65     0,3127     0,3290
2         Unspecified Video
          Image characteristics are unknown.
3         reserved
4         Recommendation ITU-R BT.470-2 System M
              primary       x          y
              green         0,21       0,71
              blue          0,14       0,08
              red           0,67       0,33
              white C       0,310      0,316
5         Recommendation ITU-R BT.470-2 System B, G
              primary       x          y
              green         0,29       0,60
              blue          0,15       0,06
              red           0,64       0,33
              white D65     0,313      0,329
6         SMPTE 170M
              primary       x          y
              green         0,310      0,595
              blue          0,155      0,070
              red           0,630      0,340
              white D65     0,3127     0,3290
7         SMPTE 240M (1987)
              primary       x          y
              green         0,310      0,595
              blue          0,155      0,070
              red           0,630      0,340
              white D65     0,3127     0,3291
8-255     reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero
the chromaticity is assumed to be that corresponding to colour_primaries having the value 1.

transfer_characteristics -- This 8-bit integer describes the opto-electronic transfer characteristic of the
source picture, and is defined in Table 6-8.

Table 6-8. Transfer Characteristics

Value     Transfer Characteristic
0         (forbidden)
1         Recommendation ITU-R BT.709
              V = 1,099 Lc^0,45 - 0,099     for 1 ≥ Lc ≥ 0,018
              V = 4,500 Lc                  for 0,018 > Lc ≥ 0
2         Unspecified Video
          Image characteristics are unknown.
3         reserved
4         Recommendation ITU-R BT.470-2 System M
          Assumed display gamma 2,2
5         Recommendation ITU-R BT.470-2 System B, G
          Assumed display gamma 2,8
6         SMPTE 170M
              V = 1,099 Lc^0,45 - 0,099     for 1 ≥ Lc ≥ 0,018
              V = 4,500 Lc                  for 0,018 > Lc ≥ 0
7         SMPTE 240M (1987)
              V = 1,1115 Lc^0,45 - 0,1115   for Lc ≥ 0,0228
              V = 4,0 Lc                    for 0,0228 > Lc
8         Linear transfer characteristics
          i.e. V = Lc
9-255     reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero
the transfer characteristics are assumed to be those corresponding to transfer_characteristics having the
value 1.
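The piecewise transfer characteristic listed for Recommendation ITU-R BT.709 (value 1, which is also the default) can be written out as a short function. This is an informal sketch: the function name is hypothetical, the decimal commas of the table are written as decimal points, and inputs outside the range 0 to 1 are rejected as an assumption of the example.

```python
def bt709_transfer(lc: float) -> float:
    """Map linear light Lc in [0, 1] to the signal V per the BT.709 entry."""
    if not 0.0 <= lc <= 1.0:
        raise ValueError("Lc is expected in the range 0 to 1")
    if lc >= 0.018:
        # Power-law segment: V = 1.099 * Lc^0.45 - 0.099
        return 1.099 * lc ** 0.45 - 0.099
    # Linear segment near black: V = 4.500 * Lc
    return 4.500 * lc
```

The two segments meet at Lc = 0,018, where both evaluate to approximately 0,081, so the curve has no visible step at the changeover.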
matrix_coefficients -- This 8-bit integer describes the matrix coefficients used in deriving luminance and
chrominance signals from the green, blue, and red primaries, and is defined in Table 6-9.

In this table:

E'Y is analogue with values between 0 and 1

E'PB and E'PR are analogue between the values -0,5 and 0,5

E'R, E'G and E'B are analogue with values between 0 and 1

Y, Cb and Cr are related to E'Y, E'PB and E'PR by the following formulae.

    Y  = ( 219 * E'Y )  + 16

    Cb = ( 224 * E'PB ) + 128

    Cr = ( 224 * E'PR ) + 128

NOTE - The decoding process given by this specification limits output sample values for Y, Cr and
Cb to the range [0:255]. Thus sample values outside the range implied by the above
equations may occasionally occur at the output of the decoding process. In particular the
sample values 0 and 255 may occur.

Table 6-9. Matrix Coefficients

Value     Matrix
0         (forbidden)
1         Recommendation ITU-R BT.709
              E'Y  =  0,7154 E'G + 0,0721 E'B + 0,2125 E'R
              E'PB = -0,386 E'G + 0,500 E'B - 0,115 E'R
              E'PR = -0,454 E'G - 0,046 E'B + 0,500 E'R
2         Unspecified Video
          Image characteristics are unknown.
3         reserved
4         FCC
              E'Y  =  0,59 E'G + 0,11 E'B + 0,30 E'R
              E'PB = -0,331 E'G + 0,500 E'B - 0,169 E'R
              E'PR = -0,421 E'G - 0,079 E'B + 0,500 E'R
5         Recommendation ITU-R BT.470-2 System B, G
              E'Y  =  0,587 E'G + 0,114 E'B + 0,299 E'R
              E'PB = -0,331 E'G + 0,500 E'B - 0,169 E'R
              E'PR = -0,419 E'G - 0,081 E'B + 0,500 E'R
6         SMPTE 170M
              E'Y  =  0,587 E'G + 0,114 E'B + 0,299 E'R
              E'PB = -0,331 E'G + 0,500 E'B - 0,169 E'R
              E'PR = -0,419 E'G - 0,081 E'B + 0,500 E'R
7         SMPTE 240M (1987)
              E'Y  =  0,701 E'G + 0,087 E'B + 0,212 E'R
              E'PB = -0,384 E'G + 0,500 E'B - 0,116 E'R
              E'PR = -0,445 E'G - 0,055 E'B + 0,500 E'R
8-255     reserved

In the case that sequence_display_extension() is not present in the bitstream or colour_description is zero
the matrix coefficients are assumed to be those corresponding to matrix_coefficients having the value 1.
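To make the relationship between the matrix equations and the digital Y, Cb, Cr formulae concrete, the following sketch applies the Recommendation ITU-R BT.709 row (value 1) to analogue E'R, E'G, E'B inputs. It is illustrative only: the function name is hypothetical, and rounding to the nearest integer followed by limiting to [0:255] is an assumption of this example, not something stated in this clause.

```python
def bt709_rgb_to_ycbcr(e_r: float, e_g: float, e_b: float) -> tuple:
    """Convert analogue E'R, E'G, E'B (each 0..1) to 8-bit Y, Cb, Cr."""
    # Matrix equations from the BT.709 entry (decimal commas written as points).
    e_y = 0.7154 * e_g + 0.0721 * e_b + 0.2125 * e_r
    e_pb = -0.386 * e_g + 0.500 * e_b - 0.115 * e_r
    e_pr = -0.454 * e_g - 0.046 * e_b + 0.500 * e_r

    # Digital representation: Y = 219*E'Y + 16, Cb/Cr = 224*E'P + 128.
    y = 219 * e_y + 16
    cb = 224 * e_pb + 128
    cr = 224 * e_pr + 128

    # Assumption: round to nearest and limit to the range [0:255].
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))
```

With these coefficients, black maps to (16, 128, 128) and peak white to (235, 128, 128), which shows the headroom left between the nominal range and the limits 0 and 255 mentioned in the NOTE.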

display_horizontal_size -- See display_vertical_size.

display_vertical_size -- display_horizontal_size and display_vertical_size together define a rectangle
which may be considered as the "intended display's" active region. If this rectangle is smaller than the
encoded frame size then the display process may be expected to display only a portion of the encoded
frame. Conversely if the display rectangle is larger than the encoded frame size then the display process
may be expected to display the reconstructed frames on a portion of the display device rather than on the
whole display device.

display_horizontal_size shall be in the same units as horizontal_size (samples of the encoded frames).

display_vertical_size shall be in the same units as vertical_size (lines of the encoded frames).

display_horizontal_size and display_vertical_size do not affect the decoding process but may be used by
the display process that is not standardised in this specification.

6.3.7 Sequence scalable extension

It is a syntactic restriction that if a sequence_scalable_extension() is present in the bitstream following a
given sequence_extension() then sequence_scalable_extension() shall follow every other occurrence of
sequence_extension(). Thus a bitstream is either scalable or it is not scalable. It is not possible to mix
scalable and non-scalable coding within a sequence.

scalable_mode -- The scalable_mode indicates the type of scalability used in the video sequence. If no
sequence_scalable_extension() is present in the bitstream then no scalability is used for that sequence.
scalable_mode also indicates the macroblock_type tables to be used. However in the case of spatial
scalability if no picture_spatial_scalable_extension() is present for a given picture then that picture shall
be decoded in a non-scalable manner (i.e. as if sequence_scalable_extension() had not been present).



Table 6-10. Definition of scalable_mode

scalable_mode     Meaning                  picture_spatial_scalable_extension()     macroblock_type tables
sequence_scalable_extension() not present                                           B-2, B-3 and B-4
00                data partitioning                                                 B-2, B-3 and B-4
01                spatial scalability      present                                  B-5, B-6 and B-7
01                spatial scalability      not present                              B-2, B-3 and B-4
10                SNR scalability                                                   B-8
11                temporal scalability                                              B-2, B-3 and B-4


layer_id -- This is an integer which identifies the layers in a scalable hierarchy. The base layer always
has layer_id = 0. However the base layer of a scalable hierarchy does not carry a
sequence_scalable_extension() and hence layer_id, except in the case of data partitioning. Each successive
layer has a layer_id which is one greater than the layer for which it is an enhancement.

In the case of data partitioning layer_id shall be zero for partition zero and layer_id shall be one for
partition one.

lower_layer_prediction_horizontal_size -- This is a 14-bit integer indicating the horizontal size of the
lower layer frame which is used for prediction. This shall contain the value contained in horizontal_size
(horizontal_size_value and horizontal_size_extension) in the lower layer bitstream.

lower_layer_prediction_vertical_size -- This is a 14-bit integer indicating the vertical size of the lower
layer frame which is used for prediction. This shall contain the value contained in vertical_size
(vertical_size_value and vertical_size_extension) in the lower layer bitstream.

horizontal_subsampling_factor_m -- This affects the spatial scalable upsampling process, as defined in
7.7.2. The value zero is forbidden.

horizontal_subsampling_factor_n -- This affects the spatial scalable upsampling process, as defined in
7.7.2. The value zero is forbidden.

vertical_subsampling_factor_m -- This affects the spatial scalable upsampling process, as defined in
7.7.2. The value zero is forbidden.

vertical_subsampling_factor_n -- This affects the spatial scalable upsampling process, as defined in
7.7.2. The value zero is forbidden.

picture_mux_enable -- If set to 1, picture_mux_order and picture_mux_factor are used for
remultiplexing prior to display.

mux_to_progressive_sequence -- This flag when set to '1' indicates that the decoded pictures
corresponding to the two layers shall be temporally multiplexed to generate a progressive sequence for
display. When the temporal multiplexing is intended to generate an interlaced sequence this flag shall be
'0'.

picture_mux_order -- It denotes the number of enhancement layer pictures prior to the first base layer
picture. It thus assists remultiplexing of pictures prior to display as it contains information for inverting
the demultiplexing performed at the encoder.

picture_mux_factor -- It denotes the number of enhancement layer pictures between consecutive base layer
pictures to allow correct remultiplexing of base and enhancement layers for display. It also assists in
remultiplexing of pictures prior to display as it contains information for inverting the temporal
demultiplexing performed at the encoder. The value '000' is reserved.



6.3.8 Group of pictures header

group_start_code -- The group_start_code is the bit string '000001B8' in hexadecimal. It identifies the
beginning of a group of pictures header.

time_code -- This is a 25-bit integer containing the following: drop_frame_flag, time_code_hours,
time_code_minutes, marker_bit, time_code_seconds and time_code_pictures as shown in Table 6-11. The
parameters correspond to those defined in the IEC standard publication 461 for "time and control codes
for video tape recorders" (see Bibliography, Annex G). The time code refers to the first picture after the
group of pictures header that has a temporal_reference of zero. The drop_frame_flag can be set to either
'0' or '1'. It may be set to '1' only if the frame rate is 29,97Hz. If it is '0' then pictures are counted
assuming rounding to the nearest integral number of pictures per second, for example 29,97Hz would be
rounded to and counted as 30Hz. If it is '1' then picture numbers 0 and 1 at the start of each minute,
except minutes 0, 10, 20, 30, 40, 50 are omitted from the count.

NOTE - The information carried by time_code plays no part in the decoding process.



Table 6-11. time_code

time_code              range of value     No. of bits     Mnemonic
drop_frame_flag                           1               uimsbf
time_code_hours        0-23               5               uimsbf
time_code_minutes      0-59               6               uimsbf
marker_bit             1                  1               bslbf
time_code_seconds      0-59               6               uimsbf
time_code_pictures     0-59               6               uimsbf
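The layout of the 25-bit time_code can be illustrated by packing and unpacking its fields in the order given in Table 6-11, most significant field first. The helper names below are hypothetical and the sketch is not part of this specification; since time_code plays no part in decoding, such a routine would only be used for display or editing purposes.

```python
def pack_time_code(hours: int, minutes: int, seconds: int, pictures: int,
                   drop_frame_flag: int = 0) -> int:
    """Assemble the 25-bit time_code; marker_bit is always '1'."""
    if not (0 <= hours <= 23 and 0 <= minutes <= 59
            and 0 <= seconds <= 59 and 0 <= pictures <= 59):
        raise ValueError("field outside the range given in Table 6-11")
    return ((drop_frame_flag & 1) << 24 | hours << 19 | minutes << 13
            | 1 << 12 | seconds << 6 | pictures)


def unpack_time_code(time_code: int) -> dict:
    """Split a 25-bit time_code into its fields."""
    if not 0 <= time_code < (1 << 25):
        raise ValueError("time_code is a 25-bit integer")
    fields = {
        "drop_frame_flag": (time_code >> 24) & 0x01,     # 1 bit
        "time_code_hours": (time_code >> 19) & 0x1F,     # 5 bits
        "time_code_minutes": (time_code >> 13) & 0x3F,   # 6 bits
        "marker_bit": (time_code >> 12) & 0x01,          # 1 bit
        "time_code_seconds": (time_code >> 6) & 0x3F,    # 6 bits
        "time_code_pictures": time_code & 0x3F,          # 6 bits
    }
    if fields["marker_bit"] != 1:
        raise ValueError("marker_bit shall be '1'")
    return fields
```

The field widths sum to 1 + 5 + 6 + 1 + 6 + 6 = 25 bits, matching the stated size of time_code.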



closed_^op — This is a one-bit flag >;sdiich indicates the nature of the predictions used in the first 
consecutive B-pictures (if any) immediately following the first coded I-jframe following the group of 
picture header . 
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closed_gop is set to 'r to indicate that these B-pictures have been ^coded using only badcward 
prediction or intra coding. 

This bit is provided for use during any editing which occurs after encoding. If the previous pictures have
been removed by editing, broken_link may be set to '1' so that a decoder may avoid displaying these
B-Pictures following the first I-Picture following the group of picture header. However if the closed_gop bit
is set to '1', then the editor may choose not to set the broken_link bit as these B-Pictures can be correctly
decoded.

broken_link - This is a one-bit flag which shall be set to '0' during encoding. It is set to '1' to indicate
that the first consecutive B-Pictures (if any) immediately following the first coded I-frame following the
group of picture header may not be correctly decoded because the reference frame which is used for
prediction is not available (because of the action of editing).

A decoder may use this flag to avoid displaying frames that cannot be correctly decoded.

6.3.9 Picture header

picture_start_code — The picture_start_code is a string of 32 bits having the value 00000100 in 
hexadecimal. 

temporal_reference - The temporal_reference is a 10-bit unsigned integer associated with each coded
picture.

The following specification applies when low_delay is equal to zero.

When a frame is coded as two field pictures, the temporal_reference associated with each coded picture
shall be the same. The temporal_reference of each coded frame shall increment by one modulo 1024
when examined in display order at the output of the decoding process, except when a group of pictures
header occurs. After a group of pictures header, the temporal_reference of the first frame in display order
shall be set to zero.

The following specification applies when low_delay is equal to one. 

When low_delay is equal to one, there may be situations where the VBV buffer shall be re-examined
several times before removing a coded picture (referred to as a big picture) from the VBV buffer.

If there is a big picture, the temporal_reference of the picture immediately following the big picture shall
be equal to the temporal_reference of the big picture incremented by N+1 modulo 1024, where N is the
number of times that the VBV buffer is re-examined (N>0). If the big picture is immediately followed by
a group of pictures header, the temporal_reference of the first coded picture after the group of pictures
header shall be set to N.

The temporal_reference of a picture that does not immediately follow a big picture follows the
specification for the case when low_delay is equal to zero.

NOTE - If the big picture is the first field of a frame coded with field pictures, then the
temporal_reference of the two field pictures of that coded frame are not identical.
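
The low_delay rule for the picture that follows a big picture can be sketched as follows. The function name
and the boolean argument gop_header_follows are assumptions made for this illustration.

```python
def temporal_reference_after_big_picture(big_temporal_reference, n,
                                         gop_header_follows=False):
    """temporal_reference of the picture right after a big picture.

    n is the number of times the VBV buffer was re-examined (N > 0).
    gop_header_follows is a hypothetical flag saying that a group of
    pictures header sits between the big picture and the next picture.
    """
    if n <= 0:
        raise ValueError("N must be greater than zero for a big picture")
    if gop_header_follows:
        return n  # counting restarts after the group of pictures header
    return (big_temporal_reference + n + 1) % 1024  # 10-bit wrap-around
```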

picture_coding_type - The picture_coding_type identifies whether a picture is an intra-coded picture (I),
predictive-coded picture (P) or bidirectionally predictive-coded picture (B). The meaning of
picture_coding_type is defined in Table 6-12.

NOTE - Intra-coded pictures with only DC coefficients (D-pictures) that may be used in
ISO/IEC 11172-2 are not supported by this specification.

Table 6-12 - picture_coding_type

picture_coding_type   coding method
000                   forbidden
001                   intra-coded (I)
010                   predictive-coded (P)
011                   bidirectionally-predictive-coded (B)
100                   shall not be used (dc intra-coded (D) in ISO/IEC 11172-2)
101                   reserved
110                   reserved
111                   reserved



vbv_delay - The vbv_delay is a 16-bit unsigned integer. In all cases other than when vbv_delay has the
value hexadecimal FFFF, the value of vbv_delay is the number of periods of a 90 kHz clock derived from
the 27 MHz system clock that the VBV shall wait after receiving the final byte of the picture start code
before decoding the picture. vbv_delay shall be coded to represent the delay as specified above or it shall
be coded with the value hexadecimal FFFF. If any vbv_delay field in a sequence is coded with
hexadecimal FFFF then all of them shall be coded with this value. If vbv_delay takes the value
hexadecimal FFFF, input of data to the VBV buffer is defined in C.3.2 of annex C, otherwise input to the
VBV buffer is defined in clause C.3.1.

If low_delay is equal to '1' and if the bitstream contains big pictures, the vbv_delay values encoded in the
picture_header() of big pictures may be wrong if not equal to hexadecimal FFFF.

NOTE - There are several ways of calculating vbv_delay in an encoder.

In all cases it may be calculated by noting that the end-to-end delay through the encoder and
decoder buffer is constant for all pictures. The encoder is capable of knowing the delay
experienced by the relevant picture start code in the encoder buffer and the total end-to-end
delay. Therefore the value encoded in vbv_delay (the decoder buffer delay of the picture start
code) is calculated as the total delay less the delay of the corresponding picture start code in the
encoder buffer measured in periods of a 90 kHz clock derived from the 27 MHz system clock.

Alternatively, for constant bitrate operation only, vbv_delay may be calculated from the state of
the VBV as follows:

vbv_delay_n = 90 000 * B_n* / R

where:
n > 0

B_n* = VBV occupancy, measured in bits, immediately before removing picture n from the
buffer but after removing any header(s), user data and stuffing that immediately
precedes the data elements of picture n.

R = the actual bitrate (i.e. to full accuracy rather than the quantised value given by bit_rate
in the sequence header.)

An equivalent method of calculating vbv_delay for variable bitrate streams can be derived from
the equation in C.3.1. This will be in the form of a recurrence relation for the vbv_delay given
the previous vbv_delay, the decoding times of the current and previous pictures, and the number
of bytes in the previous picture. This method can be applied if, at the time vbv_delay is encoded,
the average bitrate of the transfer of the picture data of the previous picture is known.
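
The constant bitrate formula in the note above can be illustrated with a short calculation. The function
name is hypothetical, and truncation to an integer is an assumption of this sketch, since the note does not
state a rounding rule.

```python
def vbv_delay_cbr(occupancy_bits, bitrate_bits_per_second):
    """vbv_delay_n = 90 000 * B_n* / R, in periods of the 90 kHz clock.

    occupancy_bits is B_n* and bitrate_bits_per_second is R. The result
    is truncated to an integer (an assumption of this sketch).
    """
    if bitrate_bits_per_second <= 0:
        raise ValueError("R must be positive")
    delay = int(90000 * occupancy_bits / bitrate_bits_per_second)
    if not 0 <= delay < 0xFFFF:
        # 0xFFFF has a special meaning, so a real delay must stay below it.
        raise ValueError("delay does not fit the 16-bit vbv_delay field")
    return delay
```

For example, an occupancy of 1 000 000 bits at 4 000 000 bit/s corresponds to a quarter of a second, that
is 22 500 periods of the 90 kHz clock.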

full_pel_forward_vector - This flag that is used in ISO/IEC 11172-2 is not used by this specification. It
shall have the value '0'.

forward_f_code - This 3 bit string (which is used in ISO/IEC 11172-2) is not used by this specification.
It shall have the value '111'.

full_pel_backward_vector - This flag that is used in ISO/IEC 11172-2 is not used by this specification.
It shall have the value '0'.

backward_f_code - This 3 bit string (which is used in ISO/IEC 11172-2) is not used by this
specification. It shall have the value '111'.

extra_bit_picture - A bit that indicates the presence of the following extra information. If extra_bit_picture
is set to '1', extra_information_picture will follow it. If it is set to '0', there are no data following it.
extra_bit_picture shall be set to '0', the value '1' is reserved for possible future extensions defined by
ITU-T | ISO/IEC.

extra_information_picture - Reserved. A decoder conforming to this specification that encounters
extra_information_picture in a bitstream shall ignore it (i.e. remove from the bitstream and discard). A
bitstream conforming to this specification shall not contain this syntax element.

6.3.10 Picture coding extension

f_code[s][t] - A 4 bit unsigned integer taking values 1 through 9, or 15. The value zero is forbidden and
the values 10 through 14 are reserved. It is used in the decoding of motion vectors, see 7.6.3.1.

In an I-picture in which concealment_motion_vectors is zero f_code[s][t] is not used (since motion vectors
are not used) and shall take the value 15 (all ones).

Similarly, in an I-picture or a P-picture f_code[1][t] is not used in the decoding process (since it refers to
backwards motion vectors) and shall take the value 15 (all ones).

See Table 7-7 for the meaning of the indices s and t.
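
The constraints on f_code[s][t] can be sketched as two small checks. Both function names, and the use of
the strings "I", "P" and "B" for the picture type, are assumptions made for this illustration.

```python
def check_f_code_value(value):
    """Raise ValueError for f_code values that are forbidden or reserved."""
    if value == 0:
        raise ValueError("f_code value zero is forbidden")
    if 10 <= value <= 14:
        raise ValueError("f_code values 10 through 14 are reserved")
    if not 1 <= value <= 15:
        raise ValueError("f_code is a 4 bit unsigned integer")
    return value


def f_codes_required_to_be_15(picture_type, concealment_motion_vectors):
    """Return the set of (s, t) index pairs whose f_code must be 15."""
    required = set()
    if picture_type == "I" and concealment_motion_vectors == 0:
        # No motion vectors at all: every f_code is unused.
        required |= {(0, 0), (0, 1), (1, 0), (1, 1)}
    if picture_type in ("I", "P"):
        # s = 1 refers to backward motion vectors, unused in I and P.
        required |= {(1, 0), (1, 1)}
    return required
```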

intra_dc_precision - This is a 2-bit integer defined in Table 6-13.



Table 6-13 - Intra DC precision

intra_dc_precision   Precision (bits)
00                   8
01                   9
10                   10
11                   11



The inverse quantisation process for the Intra DC coefficients is modified by this parameter as explained 
in 7.4.1. 

picture_structure — This is a 2-bit integer defined in the Table 6-14. 

Table 6-14 - Meaning of picture_structure

picture_structure   Meaning
00                  reserved
01                  Top Field
10                  Bottom Field
11                  Frame picture



When a frame is encoded in the form of two field pictures both fields must be of the same
picture_coding_type, except where the first encoded field is an I-picture in which case the second may be
either an I-picture or a P-picture.

The first encoded field of a frame may be a top field or a bottom field, and the next field must be of
opposite parity.

When a frame is encoded in the form of two field pictures the following syntax elements may be set
independently in each field picture:

• f_code[0][0], f_code[0][1]

• f_code[1][0], f_code[1][1]

• intra_dc_precision, concealment_motion_vectors, q_scale_type

• intra_vlc_format, alternate_scan



top_field_first - The meaning of this element depends upon picture_structure, progressive_sequence and
repeat_first_field.

If progressive_sequence is equal to '0', this flag indicates what field of a reconstructed frame is output
first by the decoding process:

In a field picture top_field_first shall have the value '0', and the only field output by the decoding process
is the decoded field picture.

In a frame picture top_field_first being set to '1' indicates that the top field of the reconstructed frame is
the first field output by the decoding process. top_field_first being set to '0' indicates that the bottom field
of the reconstructed frame is the first field output by the decoding process.

If progressive_sequence is equal to '1', this flag, combined with repeat_first_field, indicates how many
times (one, two or three) the reconstructed frame is output by the decoding process.

If repeat_first_field is set to '0', top_field_first shall be set to '0'. In this case the output of the decoding
process corresponding to this reconstructed frame consists of one progressive frame.

If top_field_first is set to '0' and repeat_first_field is set to '1', the output of the decoding process
corresponding to this reconstructed frame consists of two identical progressive frames.

If top_field_first is set to '1' and repeat_first_field is set to '1', the output of the decoding process
corresponding to this reconstructed frame consists of three identical progressive frames.
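
For the case where progressive_sequence is equal to '1', the three rules above reduce to a small lookup.
The function name below is an assumption made for this sketch.

```python
def progressive_output_frame_count(top_field_first, repeat_first_field):
    """Number of times a reconstructed frame is output when
    progressive_sequence is 1 (one, two or three)."""
    if repeat_first_field == 0:
        if top_field_first != 0:
            raise ValueError("top_field_first shall be 0 when "
                             "repeat_first_field is 0")
        return 1
    return 3 if top_field_first == 1 else 2
```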

frame_pred_frame_dct - If this flag is set to '1' then only frame-DCT and frame prediction are used. In
a field picture it shall be '0'. frame_pred_frame_dct shall be '1' if progressive_frame is '1'. This flag
affects the syntax of the bitstream.

concealment_motion_vectors - This flag has the value '1' to indicate that motion vectors are coded in
intra macroblocks. This flag has the value '0' to indicate that no motion vectors are coded in intra
macroblocks.

q_scale_type - This flag affects the inverse quantisation process as described in 7.4.2.2.

intra_vlc_format - This flag affects the decoding of transform coefficient data as described in 7.2.1.

alternate_scan - This flag affects the decoding of transform coefficient data as described in 7.3.

repeat_first_field - This flag is applicable only in a frame picture, in a field picture it shall be set to zero
and does not affect the decoding process.

If progressive_sequence is equal to 0 and progressive_frame is equal to 0, repeat_first_field shall be zero,
and the output of the decoding process corresponding to this reconstructed frame consists of two fields.

If progressive_sequence is equal to 0 and progressive_frame is equal to 1:

If this flag is set to 0, the output of the decoding process corresponding to this reconstructed frame
consists of two fields. The first field (top or bottom field as identified by top_field_first) is followed by the
other field.

If it is set to 1, the output of the decoding process corresponding to this reconstructed frame consists of
three fields. The first field (top or bottom field as identified by top_field_first) is followed by the other
field, then the first field is repeated.

If progressive_sequence is equal to 1:

If this flag is set to 0, the output of the decoding process corresponding to this reconstructed frame
consists of one frame.

If it is set to 1, the output of the decoding process corresponding to this reconstructed frame consists of
two or three frames, depending on the value of top_field_first.
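
For a frame picture in an interlaced sequence (progressive_sequence equal to 0), the order of the output
fields described above can be sketched as follows. The function name and the strings "top" and "bottom"
are assumptions made for this illustration.

```python
def output_field_order(progressive_frame, top_field_first, repeat_first_field):
    """List the fields output for one frame picture when
    progressive_sequence is 0, in output order."""
    if progressive_frame == 0 and repeat_first_field != 0:
        raise ValueError("repeat_first_field shall be zero when "
                         "progressive_frame is 0")
    first, second = ("top", "bottom") if top_field_first else ("bottom", "top")
    fields = [first, second]
    if repeat_first_field:
        fields.append(first)  # the first field is repeated as a third field
    return fields
```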

chroma_420_type - If chroma_format is "4:2:0", the value of chroma_420_type shall be the same as
progressive_frame; else chroma_420_type has no meaning and shall be equal to zero. This flag exists for
historical reasons.

progressive_frame - If progressive_frame is set to 0 it indicates that the two fields of the frame are
interlaced fields in which an interval of time of the field period exists between (corresponding spatial
samples) of the two fields. In this case the following restriction applies:

• repeat_first_field shall be zero (two field duration).

If progressive_frame is set to 1 it indicates that the two fields (of the frame) are actually from the same
time instant as one another. In this case a number of restrictions to other parameters and flags in the
bitstream apply:

• picture_structure shall be "Frame"

• frame_pred_frame_dct shall be 1

progressive_frame is used when the video sequence is used as the lower layer of a spatial scalable
sequence. Here it affects the up-sampling process used in forming a prediction in the enhancement layer
from the lower layer.

composite_display_flag - This flag is set to 1 to indicate the presence of the following fields, which are
of use when the input pictures have been coded as (analogue) composite video prior to encoding into a
bitstream that complies with this specification. If it is set to 0 then these parameters do not occur in the
bitstream.

The information relates to the picture that immediately follows the extension. In the case that this picture
is a frame picture the information relates to the first field of that frame. The equivalent information for
the second field may be derived (there is no way to represent it in the bitstream).

NOTES

1 The various syntactic elements that are included in the bitstream if composite_display_flag is
'1' are not used in the decoding process.

2 repeat_first_field will cause a composite video field to be repeated out of the 4-field or 8-field
sequence. It is recommended that repeat_first_field and composite_display_flag are not both
set simultaneously.

v_axis - A 1-bit integer used only when the bitstream represents a signal that had previously been
encoded according to PAL systems. v_axis is set to 1 on a positive sign, v_axis is set to 0 otherwise.

field_sequence - A 3-bit integer which defines the number of the field in the eight field sequence used in
PAL systems or the four field sequence used in NTSC systems as defined in Table 6-15.



Table 6-15 - Definition of field_sequence

field_sequence   frame   field
000              1       1
001              1       2
010              2       3
011              2       4
100              3       5
101              3       6
110              4       7
111              4       8
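
The mapping in Table 6-15 is regular, so it can be computed rather than stored. The sketch below is an
illustration; the function name and the returned tuple layout are assumptions.

```python
def field_sequence_position(field_sequence):
    """Return (frame, field) numbers for a 3-bit field_sequence value,
    following Table 6-15."""
    if not 0 <= field_sequence <= 7:
        raise ValueError("field_sequence is a 3-bit integer")
    frame = field_sequence // 2 + 1   # two fields per frame
    field = field_sequence + 1        # fields are numbered from 1
    return frame, field
```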



sub_carrier - This is a 1-bit integer. Set to 0 means the sub-carrier/line frequency relationship is correct.
When set to 1 the relationship is not correct.

burst_amplitude - This is a 7-bit integer defining the burst amplitude (for PAL and NTSC only). The
amplitude of the sub-carrier burst is quantised as a Recommendation ITU-R BT.601 luminance signal,
with the MSB omitted.

sub_carrier_phase - This is an 8-bit integer defining the phase of the reference sub-carrier at the
field-synchronisation datum with respect to field start as defined in Recommendation ITU-R BT.470. See
Table 6-16.



Table 6-16 - Definition of sub_carrier_phase

sub_carrier_phase   Phase
0                   ([360° / 256] * 0)
1                   ([360° / 256] * 1)
...                 ...
255                 ([360° / 256] * 255)
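
Table 6-16 amounts to a linear scaling of the 8-bit value. As a brief illustration (the function name is an
assumption of this sketch):

```python
def sub_carrier_phase_degrees(sub_carrier_phase):
    """Convert the 8-bit sub_carrier_phase value to a phase in degrees,
    using steps of 360/256 degrees as in Table 6-16."""
    if not 0 <= sub_carrier_phase <= 255:
        raise ValueError("sub_carrier_phase is an 8-bit integer")
    return (360.0 / 256.0) * sub_carrier_phase
```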

6.3.11 Quant matrix extension

Each quantisation matrix has a default set of values. When a sequence_header_code is decoded all
matrices shall be reset to their default values. User defined matrices may be downloaded and this can
occur in a sequence_header() or in a quant_matrix_extension().

With 4:2:0 data only two matrices are used, one for intra blocks the other for non-intra blocks.

With 4:2:2 or 4:4:4 data four matrices are used. Both an intra and a non-intra matrix are provided for
both luminance blocks and for chrominance blocks. Note however that it is possible to download the same
user defined matrix into both the luminance and chrominance matrix at the same time.

The default matrix for intra blocks (both luminance and chrominance) is: 



8 16 19 22 26 27 29

The default matrix for non-intra blocks (both luminance and chrominance) is:

load_intra_quantiser_matrix - This is a one-bit flag which is set to '1' if intra_quantiser_matrix
follows. If it is set to '0' then there is no change in the values that shall be used.

intra_quantiser_matrix - This is a list of sixty-four 8-bit unsigned integers. The new values, encoded in
the default zigzag scanning order as described in 7.3.1, replace the previous values. The first value shall
always be 8. For all of the 8-bit unsigned integers, the value zero is forbidden. With 4:2:2 and 4:4:4 data
the new values shall be used for both the luminance intra matrix and the chrominance intra matrix.
However the chrominance intra matrix may subsequently be loaded with a different matrix.

load_non_intra_quantiser_matrix - This is a one-bit flag which is set to '1' if
non_intra_quantiser_matrix follows. If it is set to '0' then there is no change in the values that shall be
used.

non_intra_quantiser_matrix - This is a list of sixty-four 8-bit unsigned integers. The new values,
encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. For all the
8-bit unsigned integers, the value zero is forbidden. With 4:2:2 and 4:4:4 data the new values shall be
used for both the luminance non-intra matrix and the chrominance non-intra matrix. However the
chrominance non-intra matrix may subsequently be loaded with a different matrix.

load_chroma_intra_quantiser_matrix - This is a one-bit flag which is set to '1' if
chroma_intra_quantiser_matrix follows. If it is set to '0' then there is no change in the values that shall
be used. If chroma_format is "4:2:0" this flag shall take the value '0'.

chroma_intra_quantiser_matrix - This is a list of sixty-four 8-bit unsigned integers. The new values,
encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. The first
value shall always be 8. For all of the 8-bit unsigned integers, the value zero is forbidden.

load_chroma_non_intra_quantiser_matrix - This is a one-bit flag which is set to '1' if
chroma_non_intra_quantiser_matrix follows. If it is set to '0' then there is no change in the values that
shall be used. If chroma_format is "4:2:0" this flag shall take the value '0'.

chroma_non_intra_quantiser_matrix - This is a list of sixty-four 8-bit unsigned integers. The new
values, encoded in the default zigzag scanning order as described in 7.3.1, replace the previous values. For
all the 8-bit unsigned integers, the value zero is forbidden.
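
The loading rules for the intra matrices can be sketched as below. The dictionary keys "intra" and
"chroma_intra", and all function names, are assumptions made for this illustration. The sketch always
copies a newly loaded intra matrix into the chrominance slot, which is harmless for 4:2:0 data where the
chrominance matrices are not used.

```python
def validate_quantiser_matrix(values, intra):
    """Check the constraints stated for a downloaded matrix."""
    if len(values) != 64:
        raise ValueError("a quantiser matrix has sixty-four entries")
    if any(not 1 <= v <= 255 for v in values):
        raise ValueError("entries are 8-bit and the value zero is forbidden")
    if intra and values[0] != 8:
        raise ValueError("the first value of an intra matrix shall be 8")


def load_intra_quantiser_matrix(matrices, values):
    """New intra values apply to both luminance and chrominance."""
    validate_quantiser_matrix(values, intra=True)
    matrices["intra"] = list(values)
    matrices["chroma_intra"] = list(values)


def load_chroma_intra_quantiser_matrix(matrices, values):
    """A later chroma download replaces only the chrominance matrix."""
    validate_quantiser_matrix(values, intra=True)
    matrices["chroma_intra"] = list(values)
```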

6.3.12 Picture display extension

This specification does not define the display process. The information in this extension does not affect
the decoding process and may be ignored by decoders that conform to this specification.

The picture display extension allows the position of the display rectangle whose size is specified in
sequence_display_extension() to be moved on a picture-by-picture basis. One application for this is the
implementation of pan-scan.

frame_centre_horizontal_offset - This is a 16-bit signed integer giving the horizontal offset in units of
1/16th sample. A positive value shall indicate that the centre of the reconstructed frame lies to the right of
the centre of the display rectangle.

frame_centre_vertical_offset - This is a 16-bit signed integer giving the vertical offset in units of 1/16th
sample. A positive value shall indicate that the centre of the reconstructed frame lies below the centre of
the display rectangle.

The dimensions of the display rectangular region are defined in the sequence_display_extension(). The
coordinates of the region within the coded picture are defined in the picture_display_extension().

The centre of the reconstructed frame is the centre of the rectangle defined by horizontal_size and
vertical_size.

Since (in the case of an interlaced sequence) a coded picture may relate to one, two or three decoded fields
the picture_display_extension() may contain up to three offsets.

The number of frame centre offsets in the picture_display_extension() shall be defined as follows:

if ( progressive_sequence == 1 ) {
    if ( repeat_first_field == '1' ) {
        if ( top_field_first == '1' )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    } else {
        number_of_frame_centre_offsets = 1
    }
} else {
    if ( picture_structure == "field" ) {
        number_of_frame_centre_offsets = 1
    } else {
        if ( repeat_first_field == '1' )
            number_of_frame_centre_offsets = 3
        else
            number_of_frame_centre_offsets = 2
    }
}

A picture_display_extension() shall not occur unless a sequence_display_extension() followed the
previous sequence_header().

In the case that a given picture does not have a picture_display_extension() then the most recently
decoded frame centre offset shall be used. Note that each of the missing frame centre offsets has the
same value (even if two or three frame centre offsets would have been contained in the
picture_display_extension() had it been present). Following a sequence_header() the value zero shall be
used for all frame centre offsets until a picture_display_extension() defines non-zero values.

Figure 6-16 illustrates the picture display parameters. As shown the frame centre offsets contained in the
picture_display_extension() shall specify the position of the centre of the reconstructed frame from the
centre of the display rectangle.

NOTES -

1 The display rectangle may also be larger than the reconstructed frame.

2 Even in a field picture the frame_centre_vertical_offset still represents the offset of the
centre of the frame in units of 1/16th of a frame line (not a line in the field).

3 In the example of Figure 6-16 both frame_centre_horizontal_offset and
frame_centre_vertical_offset have negative values.




Figure 6-16. Frame centre offset parameters



6.3.12.1 Pan-scan

The frame centre offsets may be used to implement pan-scan in which a rectangular region is defined
which may be panned around the entire reconstructed frame.

By way of example only, this facility may be used to identify a 3/4 aspect ratio window in a 9/16 coded
picture format. This would allow a decoder to produce usable pictures for a conventional definition
television set from an encoded format intended for enhanced definition. The 3/4 aspect ratio region is
intended to contain the "most interesting" region of the picture.

The 3/4 region is defined by display_horizontal_size and display_vertical_size. The 9/16 frame size is
defined by horizontal_size and vertical_size.
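
As a hedged illustration of how a display device might place the pan-scan window, the sketch below
converts the frame centre offsets (in units of 1/16th sample) into the top-left corner of the display
rectangle, expressed in samples and frame lines of the reconstructed frame. This specification does not
define the display process, so the function name, the coordinate convention (origin at the top-left of the
reconstructed frame, y increasing downwards) and the example sizes are all assumptions.

```python
def display_rectangle_origin(horizontal_size, vertical_size,
                             display_horizontal_size, display_vertical_size,
                             frame_centre_horizontal_offset,
                             frame_centre_vertical_offset):
    """Return (left, top) of the display rectangle in frame coordinates."""
    # A positive offset means the frame centre lies to the right of (or
    # below) the rectangle centre, so the rectangle centre is the frame
    # centre minus the offset.
    centre_x = horizontal_size / 2 - frame_centre_horizontal_offset / 16
    centre_y = vertical_size / 2 - frame_centre_vertical_offset / 16
    left = centre_x - display_horizontal_size / 2
    top = centre_y - display_vertical_size / 2
    return left, top
```

With a 720 sample wide frame and a 540 sample wide window, a horizontal offset of zero centres the window,
and offsets of +1440 and -1440 move it to the left and right edges of the frame respectively.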

6.3.13 Picture temporal scalable extension
NOTE - See also 7.9.

reference_select_code - This is a 2-bit code that identifies reference frames or reference fields for
prediction depending on the picture type.

forward_temporal_reference - A 10 bit unsigned integer value which indicates temporal reference of
the lower layer frame to be used to provide the forward prediction. If the lower layer indicates temporal
reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates
temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be
set to zero.

backward_temporal_reference - A 10 bit unsigned integer value which indicates temporal reference of
the lower layer frame to be used to provide the backward prediction. If the lower layer indicates temporal
reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates
temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be
set to zero.

6.3.14 Picture spatial scalable extension

lower_layer_temporal_reference - A 10 bit unsigned integer value which indicates temporal reference
of the lower layer frame to be used to provide the prediction. If the lower layer indicates temporal
reference with more than 10 bits, the least significant bits are encoded here. If the lower layer indicates
temporal reference with fewer than 10 bits, all bits are encoded here and the more significant bits shall be
set to zero.

lower_layer_horizontal_offset - This 15 bit signed (twos complement) integer specifies the horizontal
offset (of the top left hand corner) of the upsampled lower layer frame relative to the enhancement layer
picture. It is expressed in units of the enhancement layer picture sample width. If the chrominance
format is 4:2:0 or 4:2:2 then this parameter shall be an even number.

lower_layer_vertical_offset - This 15 bit signed (twos complement) integer specifies the vertical offset
(of the top left hand corner) of the upsampled lower layer picture relative to the enhancement layer
picture. It is expressed in units of the enhancement layer picture sample height. If the chrominance
format is 4:2:0 then this parameter shall be an even number.
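
The two offsets are 15 bit twos complement values with parity constraints that depend on the chrominance
format. A minimal sketch of decoding and checking them follows; the function names and the use of the
strings "4:2:0", "4:2:2" and "4:4:4" for the chrominance format are assumptions.

```python
def decode_twos_complement(raw, bits=15):
    """Interpret an unsigned bit field as a twos complement integer."""
    if not 0 <= raw < (1 << bits):
        raise ValueError("value does not fit the field")
    sign_bit = 1 << (bits - 1)
    return raw - (1 << bits) if raw & sign_bit else raw


def lower_layer_offsets_are_valid(horizontal_offset, vertical_offset,
                                  chroma_format):
    """Check the even-number constraints on the decoded offsets."""
    if chroma_format in ("4:2:0", "4:2:2") and horizontal_offset % 2 != 0:
        return False
    if chroma_format == "4:2:0" and vertical_offset % 2 != 0:
        return False
    return True
```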

spatial_temporal_weight_code_table_index - This 2 bit integer indicates which table of spatial
temporal weight codes is to be used as defined in 7.7. Permissible values of
spatial_temporal_weight_code_table_index are defined in Table 7-21.

lower_layer_progressive_frame - This flag shall be set to 0 if the lower layer frame is interlaced and
shall be set to '1' if the lower layer frame is progressive. The use of this flag in the spatial scalable
upsampling process is defined in 7.7.

lower_layer_deinterlaced_field_select - This flag affects the spatial scalable upsampling process, as
defined in 7.7.

6.3.15 Copyright extension

extension_start_code_identifier - This is a 4-bit integer which identifies the extension. See Table 6-2.

copyright_flag - This is a one bit flag. When copyright_flag is set to '1', it indicates that the source
video material encoded in all the coded pictures following the copyright extension, in coding order, up to
the next copyright extension or end of sequence code, is copyrighted. The copyright_identifier and
copyright_number identify the copyrighted work. When copyright_flag is set to '0', it does not indicate
whether the source video material encoded in all the coded pictures following the copyright extension, in
coding order, is copyrighted or not.

copyright_identifier - This is an 8-bit integer which identifies a Registration Authority as designated by
ISO/IEC JTC1/SC29. Value zero indicates that this information is not available. The value of
copyright_number shall be zero when copyright_identifier is equal to zero.

When copyright_flag is set to '0', copyright_identifier has no meaning and shall have the value 0.

original_or_copy - This is a one bit flag. It is set to '1' to indicate that the material is an original, and
set to '0' to indicate that it is a copy.

reserved - This is a 7-bit integer, reserved for future extension. It shall have the value zero.

copyright_number_1 - This is a 20-bit integer, representing bits 44 to 63 of copyright_number.

copyright_number_2 - This is a 22-bit integer, representing bits 22 to 43 of copyright_number.

copyright_number_3 - This is a 22-bit integer, representing bits 0 to 21 of copyright_number.

copyright_number - This is a 64-bit integer, derived from copyright_number_1, copyright_number_2,
and copyright_number_3 as follows:

copyright_number = (copyright_number_1 << 44) + (copyright_number_2 << 22) + copyright_number_3.

The meaning of copyright_number is defined only when copyright_flag is set to '1'. In this case, the
value of copyright_number identifies uniquely the copyrighted work marked by the copyright extension
and is provided by the Registration Authority identified by copyright_identifier. The value 0 for
copyright_number indicates that the identification number of the copyrighted work is not available.

When copyright_flag is set to '0', copyright_number has no meaning and shall have the value 0.
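
The derivation of copyright_number from its three parts, and the reverse split, can be illustrated as
follows. The function names are assumptions made for this sketch.

```python
def combine_copyright_number(number_1, number_2, number_3):
    """Build the 64-bit copyright_number from its 20, 22 and 22 bit parts."""
    if not 0 <= number_1 < (1 << 20):
        raise ValueError("copyright_number_1 is a 20-bit integer")
    if not 0 <= number_2 < (1 << 22) or not 0 <= number_3 < (1 << 22):
        raise ValueError("copyright_number_2 and _3 are 22-bit integers")
    return (number_1 << 44) + (number_2 << 22) + number_3


def split_copyright_number(copyright_number):
    """Split a 64-bit copyright_number back into its three coded parts."""
    if not 0 <= copyright_number < (1 << 64):
        raise ValueError("copyright_number is a 64-bit integer")
    return (
        (copyright_number >> 44) & 0xFFFFF,    # bits 44 to 63
        (copyright_number >> 22) & 0x3FFFFF,   # bits 22 to 43
        copyright_number & 0x3FFFFF,           # bits 0 to 21
    )
```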



6.3.16 Slice

slice_start_code - The slice_start_code is a string of 32 bits. The first 24 bits have the value 000001 in
hexadecimal and the last 8 bits are the slice_vertical_position having a value in the range 01 through AF
hexadecimal inclusive.

slice_vertical_position - This is given by the last eight bits of the slice_start_code. It is an unsigned
integer giving the vertical position in macroblock units of the first macroblock in the slice.

In large pictures (when the vertical size of the frame is greater than 2800 lines) the slice vertical position
is extended by the slice_vertical_position_extension.

The macroblock row may be calculated as follows:

if ( vertical_size > 2800 )
    mb_row = (slice_vertical_position_extension << 7) + slice_vertical_position - 1;
else
    mb_row = slice_vertical_position - 1;

The slice_vertical_position of the first row of macroblocks is one. Some slices may have the same
slice_vertical_position, since slices may start and finish anywhere. The maximum value of
slice_vertical_position is 175 unless slice_vertical_position_extension is present in which case
slice_vertical_position shall be in the range [1:128].
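
The start code test and the macroblock row calculation above can be sketched together. The function names
are assumptions made for this illustration, and the start code is taken as an already assembled 32-bit
integer.

```python
def slice_vertical_position_from_start_code(start_code):
    """Extract slice_vertical_position from a 32-bit slice_start_code."""
    if start_code >> 8 != 0x000001:
        raise ValueError("not a start code prefix")
    position = start_code & 0xFF
    if not 0x01 <= position <= 0xAF:
        raise ValueError("not a slice start code")
    return position


def macroblock_row(vertical_size, slice_vertical_position,
                   slice_vertical_position_extension=0):
    """Compute mb_row (counted from zero) for the first macroblock of a slice."""
    if vertical_size > 2800:
        return ((slice_vertical_position_extension << 7)
                + slice_vertical_position - 1)
    return slice_vertical_position - 1
```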

priority_breakpoint - This is a 7-bit integer that indicates the point in the syntax where the bitstream
shall be partitioned. The allowed values and their semantic interpretation are given in Table 7-30.

priority_breakpoint shall take the value zero in partition 1.

quantiser_scale_code - A 5 bit unsigned integer in the range 1 to 31. The decoder shall use this value
until another quantiser_scale_code is encountered either in slice() or macroblock(). The value zero is
forbidden.

intra_slice_flag - This flag shall be set to '1' to indicate the presence of intra_slice and reserved_bits in
the bitstream.

intra_slice - This flag shall be set to '0' if any of the macroblocks in the slice are non-intra macroblocks.
If all of the macroblocks are intra macroblocks then intra_slice may be set to '1'. intra_slice may be
omitted from the bitstream (by setting intra_slice_flag to '0') in which case it shall be assumed to have the
value zero.

intra_slice is not used by the decoding process. intra_slice is intended to aid a DSM application in
performing FF/FR (see D.12).

reserved_bits - This is a 7 bit integer, it shall have the value zero, other values are reserved.

extra_bit_slice - This flag indicates the presence of the following extra information. If extra_bit_slice is
set to '1', extra_information_slice will follow it. If it is set to '0', there are no data following it.
extra_bit_slice shall be set to '0', the value '1' is reserved for possible future extensions defined by
ITU-T | ISO/IEC.

extra_information_slice - Reserved. A decoder conforming to this specification that encounters
extra_information_slice in a bitstream shall ignore it (i.e. remove from the bitstream and discard). A
bitstream conforming to this specification shall not contain this syntax element.

6.3.17 Macroblock

NOTE - "macroblock_stuffing", which is supported in ISO/IEC 11172-2, shall not be used in a
bitstream defined by this specification.

macroblock_escape - The macroblock_escape is a fixed bit-string '0000 0001 000' which is used when
the difference between macroblock_address and previous_macroblock_address is greater than 33. It causes
the value of macroblock_address_increment to be 33 greater than the value that will be decoded by
subsequent macroblock_escape and the macroblock_address_increment codewords.

For example, if there are two macroblock_escape codewords preceding the
macroblock_address_increment, then 66 is added to the value indicated by
macroblock_address_increment.

macroblock_address_increment - This is a variable length coded integer coded as per Annex B
Table B-1 which indicates the difference between macroblock_address and previous_macroblock_address.
The maximum value of macroblock_address_increment is 33. Values greater than this can be encoded
using the macroblock_escape codeword.

The macroblock_address is a variable defining the absolute position of the current macroblock. The
macroblock_address of the top-left macroblock is zero.

The previous_macroblock_address is a variable defining the absolute position of the last non-skipped
macroblock (see 7.6.6 for the definition of skipped macroblocks) except at the start of a slice. At the start
of a slice previous_macroblock_address is reset as follows:

    previous_macroblock_address = (mb_row * mb_width) - 1

The horizontal spatial position in macroblock units of a macroblock in the picture (mb_column) can be
computed from the macroblock_address as follows:

    mb_column = macroblock_address % mb_width

where mb_width is the number of macroblocks in one row of the picture.
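
The address bookkeeping described above can be sketched in C as follows. The function names are hypothetical, and the sketch assumes that the number of macroblock_escape codewords and the decoded macroblock_address_increment are already available from the variable length decoder.

```c
#include <assert.h>

/* Value of previous_macroblock_address at the start of a slice. */
static int slice_start_previous_address(int mb_row, int mb_width)
{
    return (mb_row * mb_width) - 1;
}

/* Address of the next coded macroblock.
 * escapes: number of macroblock_escape codewords read before the increment.
 * increment: value decoded from macroblock_address_increment (1 to 33). */
static int next_macroblock_address(int previous_address, int escapes, int increment)
{
    assert(escapes >= 0 && increment >= 1 && increment <= 33);
    return previous_address + escapes * 33 + increment;
}

/* Horizontal position of a macroblock, in macroblock units. */
static int mb_column(int macroblock_address, int mb_width)
{
    return macroblock_address % mb_width;
}
```

With mb_width equal to 45 and a slice starting in mb_row 2, previous_macroblock_address is reset to 89, so an increment of 1 addresses macroblock 90 in column 0. Two escape codewords followed by an increment of 5 add 71 and address macroblock 160 in column 25.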

Except at the start of a slice, if the value of macroblock_address recovered from
macroblock_address_increment and the macroblock_escape codes (if any) differs from the
previous_macroblock_address by more than one then some macroblocks have been skipped. It is a
requirement that:

• There shall be no skipped macroblocks in I-pictures except when

    either picture_spatial_scalable_extension() follows the picture_header() of the current picture,

    or sequence_scalable_extension() is present in the bitstream and scalable_mode = "SNR
    scalability".

• The first and last macroblock of a slice shall not be skipped.

• In a B-picture there shall be no skipped macroblocks immediately following a macroblock in
  which macroblock_intra is one.

6.3.17.1 Macroblock modes

macroblock_type - Variable length coded indicator which indicates the method of coding and content of
the macroblock according to the Tables B-2 through B-8, selected by picture_coding_type and
scalable_mode.

macroblock_quant - Derived from macroblock_type according to the Tables B-2 through B-8. This is set
to 1 to indicate that quantiser_scale_code is present in the bitstream.

macroblock_motion_forward - Derived from macroblock_type according to the Tables B-2 through
B-8. This flag affects the bitstream syntax and is used by the decoding process.

macroblock_motion_backward - Derived from macroblock_type according to the Tables B-2 through
B-8. This flag affects the bitstream syntax and is used by the decoding process.

macroblock_pattern - Derived from macroblock_type according to the Tables B-2 through B-8. This is
set to 1 to indicate that coded_block_pattern() is present in the bitstream.

macroblock_intra - Derived from macroblock_type according to the Tables B-2 through B-8. This flag
affects the bitstream syntax and is used by the decoding process.

spatial_temporal_weight_code_flag - Derived from the macroblock_type. This indicates whether the
spatial_temporal_weight_code is present in the bitstream.

When spatial_temporal_weight_code_flag is '0' (indicating that spatial_temporal_weight_code is not
present in the bitstream) the spatial_temporal_weight_class is derived from Tables B-5 to B-7. When
spatial_temporal_weight_code_flag is '1' spatial_temporal_weight_class is derived from Table 7-20.

spatial_temporal_weight_code - This is a two bit code which indicates, in the case of spatial scalability,
how the spatial and temporal predictions shall be combined to form the prediction for the macroblock. A
full description of how to form the spatial scalable prediction is given in 7.7.

frame_motion_type - This is a two bit code indicating the macroblock prediction type, defined in
Table 6-17.

If frame_pred_frame_dct is equal to 1 then frame_motion_type is omitted from the bitstream. In this case
motion vector decoding and prediction formation shall be performed as if frame_motion_type had
indicated "Frame-based prediction".

In the case of intra macroblocks (in a frame picture) when concealment_motion_vectors is equal to 1
frame_motion_type is not present in the bitstream. In this case motion vector decoding and update of the
motion vector predictors shall be performed as if frame_motion_type had indicated "Frame-based". See
7.6.3.9.

Table 6-17 Meaning of frame_motion_type

code  spatial_temporal_weight_class  prediction type  motion_vector_count  mv_format  dmv
00                                   reserved
01    0,1                            Field-based      2                    field      0
01    2,3                            Field-based      1                    field      0
10    0,1,2,3                        Frame-based      1                    frame      0
11    0,2,3                          Dual-Prime       1                    field      1


field_motion_type - This is a two bit code indicating the macroblock prediction type, defined in
Table 6-18.

In the case of intra macroblocks (in a field picture) when concealment_motion_vectors is equal to 1
field_motion_type is not present in the bitstream. In this case motion vector decoding and update of the
motion vector predictors shall be performed as if field_motion_type had indicated "Field-based". See
7.6.3.9.



Table 6-18 Meaning of field_motion_type

code  spatial_temporal_weight_class  prediction type  motion_vector_count  mv_format  dmv
00                                   reserved
01    0,1                            Field-based      1                    field      0
10    0,1                            16x8 MC          2                    field      0
11    0                              Dual-Prime       1                    field      1



dct_type - This is a flag indicating whether the macroblock is frame DCT coded or field DCT coded. If
this is set to '1', the macroblock is field DCT coded.

In the case that dct_type is not present in the bitstream then the value of dct_type (used in the remainder
of the decoding process) shall be derived as shown in Table 6-19.

Table 6-19. Value of dct_type if dct_type is not in the bitstream.

Condition                                    dct_type
picture_structure == "field"                 unused because there is no frame/field distinction
                                             in a field picture.
frame_pred_frame_dct == 1                    0 ("frame")
!(macroblock_intra || macroblock_pattern)    unused - macroblock is not coded
macroblock is skipped                        unused - macroblock is not coded



6.3.17.2 Motion vectors

motion_vector_count is derived from field_motion_type or frame_motion_type as indicated in Table 6-17
and Table 6-18.

mv_format is derived from field_motion_type or frame_motion_type as indicated in Table 6-17 and
Table 6-18. mv_format indicates if the motion vector is a field-motion vector or a frame-motion vector.
mv_format is used in the syntax of the motion vectors and in the process of motion vector prediction.

dmv is derived from field_motion_type or frame_motion_type as indicated in Table 6-17 and Table 6-18.

motion_vertical_field_select[r][s] - This flag indicates which reference field shall be used to form the
prediction. If motion_vertical_field_select[r][s] is zero then the top reference field shall be used; if it is
one then the bottom reference field shall be used. (See Table 7-7 for the meaning of the indices r and s.)

6.3.17.3 Motion vector

motion_code[r][s][t] - This is a variable length code, as defined in Table B-10, which is used in motion
vector decoding as described in 7.6.3.1. (See Table 7-7 for the meaning of the indices r, s and t.)

motion_residual[r][s][t] - This is an integer which is used in motion vector decoding as described in
7.6.3.1. (See Table 7-7 for the meaning of the indices r, s and t.) The number of bits in the bitstream
for motion_residual[r][s][t], r_size, is derived from f_code[s][t] as follows:

    r_size = f_code[s][t] - 1

NOTE - The number of bits for both motion_residual[0][s][t] and motion_residual[1][s][t] is denoted
by f_code[s][t].

dmvector[t] - This is a variable length code, as defined in Table B-11, which is used in motion vector
decoding as described in 7.6.3.1. (See Table 7-7 for the meaning of the index t.)

6.3.17.4 Coded block pattern

coded_block_pattern_420 - A variable length code that is used to derive the variable cbp according to
Table B-9.

coded_block_pattern_1 -

coded_block_pattern_2 - For 4:2:2 and 4:4:4 data the coded block pattern is extended by the addition of
either a two bit or six bit fixed length code, coded_block_pattern_1 or coded_block_pattern_2. Then the
pattern_code[i] is derived using the following:

for (i=0; i<12; i++) {
    if (macroblock_intra)
        pattern_code[i] = 1;
    else
        pattern_code[i] = 0;
}
if (macroblock_pattern) {
    for (i=0; i<6; i++)
        if ( cbp & (1<<(5-i)) ) pattern_code[i] = 1;
    if (chroma_format == "4:2:2")
        for (i=6; i<8; i++)
            if ( coded_block_pattern_1 & (1<<(7-i)) ) pattern_code[i] = 1;
    if (chroma_format == "4:4:4")
        for (i=6; i<12; i++)
            if ( coded_block_pattern_2 & (1<<(11-i)) ) pattern_code[i] = 1;
}

If pattern_code[i] equals 1, i=0 to (block_count-1), then the block number i defined in Figures 6-8, 6-9
and 6-10 is contained in this macroblock.

The number "block_count" which determines the number of blocks in the macroblock is derived from the
chrominance format as shown in Table 6-20.



Table 6-20 block_count as a function of chroma_format

chroma_format    block_count
4:2:0            6
4:2:2            8
4:4:4            12



6.3.18 Block

The semantics of block() are described in clause 7.

7 The video decoding process

This clause specifies the decoding process that a decoder shall perform to reconstruct frames from the
coded bitstream.

With the exception of the Inverse Discrete Cosine Transform (IDCT) the decoding process is defined such
that all decoders shall produce numerically identical results. Any decoding process that produces
identical results to the process described here, by definition, complies with this specification.

The IDCT is defined statistically in order that different implementations for this function are allowed.
The IDCT specification is given in Annex A.

In 7.1 through 7.6 the simplest decoding process is specified in which no scalability features are used.
7.7 to 7.11 specify the decoding process when scalable extensions are used. 7.12 defines the output of the
decoding process.

Figure 7-1 is a diagram of the Video Decoding Process without any scalability. The diagram is simplified
for clarity.

NOTE - Throughout this specification two dimensional arrays are represented as name[q][p] where
'q' is the index in the vertical dimension and 'p' the index in the horizontal dimension.

Figure 7-1. Simplified Video Decoding Process

7.1 Higher syntactic structures

The various parameters and flags in the bitstream for macroblock() and all syntactic structures above
macroblock() shall be interpreted as indicated in clause 6. Many of these parameters and flags affect the
decoding process described in the following clauses. Once all of the macroblocks in a given picture have
been processed the entire picture will have been reconstructed.

Reconstructed fields shall be associated together in pairs to form reconstructed frames. (See
"picture_structure" in 6.3.10.)

The sequence of reconstructed frames shall be reordered as described in 6.1.1.11.

If progressive_sequence == 1 the reconstructed frames shall be output from the decoding process at
regular intervals of the frame period as shown in Figure 7-19.

If progressive_sequence == 0 the reconstructed frames shall be broken into a sequence of fields which
shall be output from the decoding process at regular intervals of the field period as shown in Figure 7-20.
In the case that a frame picture has repeat_first_field == 1 the first field of the frame shall be repeated
after the second field. (See "repeat_first_field" in 6.3.10.)

7.2 Variable length decoding 

7.2.1 specifies the decoding process used for the DC coefficient (n=0) in an intra coded block. (n is the
index of the coefficient in the appropriate zigzag scanning order.) 7.2.2 specifies the decoding process for
all other coefficients: AC coefficients (n≠0) and DC coefficients in non-intra coded blocks.

Let cc denote the colour component. It is related to the block number as specified in Table 7-1. Thus cc
is zero for the Y component, one for the Cb component and two for the Cr component.



Table 7-1. Definition of cc, colour component index

                cc
Block Number    4:2:0    4:2:2    4:4:4
0               0        0        0
1               0        0        0
2               0        0        0
3               0        0        0
4               1        1        1
5               2        2        2
6                        1        1
7                        2        2
8                                 1
9                                 2
10                                1
11                                2


7.2.1 DC coefficients in intra blocks 

DC coefficients in blocks in intra macroblocks are encoded as a variable length code denoting dct_dc_size
as defined in Table B-12 and B-13. If dct_dc_size is not equal to zero then this shall be followed by a
fixed length code, dc_dct_differential, of dct_dc_size bits. A differential value is first recovered from the
coded data which is added to a predictor in order to recover the final decoded coefficient.

If cc is zero then Table B-12 shall be used for dct_dc_size. If cc is non-zero then Table B-13 shall be used 
for dct_dc_size. 

Three predictors are maintained, one for each of the colour components, cc. Each time a DC coefficient in
a block in an intra macroblock is decoded the predictor is added to the differential to recover the actual
coefficient. Then the predictor shall be set to the value of the coefficient just decoded. At various times, as
described below, the predictors shall be reset. The reset value is derived from the parameter
intra_dc_precision as specified in Table 7-2.

Table 7-2. Relation between intra_dc_precision and the predictor reset value

intra_dc_precision    Bits of precision    reset value
0                     8                    128
1                     9                    256
2                     10                   512
3                     11                   1024



The predictors shall be reset to the reset value at the following times:

• At the start of a slice.

• Whenever a non-intra macroblock is decoded.

• Whenever a macroblock is skipped, i.e. when macroblock_address_increment > 1.

The predictors are denoted dc_dct_pred[cc].

QFS[0] shall be calculated from dc_dct_size and dc_dct_differential by any process equivalent to:

if ( dc_dct_size == 0 ) {
    dct_diff = 0;
} else {
    half_range = 2 ^ ( dc_dct_size - 1 );    Note ^ denotes power (not XOR)
    if ( dc_dct_differential >= half_range )
        dct_diff = dc_dct_differential;
    else
        dct_diff = (dc_dct_differential + 1) - (2 * half_range);
}
QFS[0] = dc_dct_pred[cc] + dct_diff;
dc_dct_pred[cc] = QFS[0]

NOTE - dct_diff and half_range are temporary variables which are not used elsewhere in this
specification.

It is a requirement of the bitstream that QFS[0] shall lie in the range:

    0 to ((2^(8 + intra_dc_precision)) - 1)
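
As an illustration of the intra DC process, the following C sketch recovers the differential and updates the predictor. The function names are hypothetical; the sketch assumes that dc_dct_size and the raw dc_dct_differential bits have already been read from the bitstream, and it computes the power of two with a shift.

```c
#include <assert.h>

/* Predictor reset value for a given intra_dc_precision (Table 7-2). */
static int dc_reset_value(int intra_dc_precision)
{
    assert(intra_dc_precision >= 0 && intra_dc_precision <= 3);
    return 128 << intra_dc_precision;
}

/* Recovers the signed differential from the size and the raw fixed length bits. */
static int dct_diff_from_bits(int dc_dct_size, int dc_dct_differential)
{
    int half_range;
    if (dc_dct_size == 0)
        return 0;
    half_range = 1 << (dc_dct_size - 1);
    if (dc_dct_differential >= half_range)
        return dc_dct_differential;
    return (dc_dct_differential + 1) - (2 * half_range);
}

/* Decodes one intra DC coefficient and updates the predictor of component cc. */
static int decode_intra_dc(int dc_dct_pred[3], int cc,
                           int dc_dct_size, int dc_dct_differential)
{
    int qfs0 = dc_dct_pred[cc] + dct_diff_from_bits(dc_dct_size, dc_dct_differential);
    dc_dct_pred[cc] = qfs0;
    return qfs0;
}
```

As a worked case, with dc_dct_size equal to 3 the raw value 5 (binary 101) is at least half_range (4) and decodes to +5, while the raw value 2 (binary 010) decodes to (2 + 1) - 8 = -5. Starting from a reset predictor of 128, the second case gives QFS[0] = 123.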

7.2.2 Other coefficients

All coefficients with the exception of the DC intra coefficients shall be encoded using Tables B-14, B-15
and B-16.

In all cases a variable length code shall first be decoded using either Table B-14 or Table B-15. The
decoded value of this code denotes one of three courses of action:

1 End of Block. In this case there are no more coefficients in the block, and the
remainder of the coefficients in the block (those for which no value has yet been decoded) shall
be set to zero. This is denoted by "End of block" in the syntax specification of 6.2.6.

2 A "normal" coefficient, in which a value of run and level is decoded followed by a single bit, s,
giving the sign of the coefficient. signed_level is computed from level and s as shown below. run
coefficients shall be set to zero and the subsequent coefficient shall have the value signed_level.

    if (s == 0)
        signed_level = level;
    else
        signed_level = (-level);

3 An "Escape" coded coefficient, in which the values of run and signed_level are fixed length
coded as described in 7.2.2.3.

7.2.2.1 Table selection 

Table 7-3 indicates which Table shall be used for decoding the DCT coefficients.

Table 7-3. Selection of DCT coefficient VLC tables

intra_vlc_format                           0       1
intra blocks (macroblock_intra = 1)        B-14    B-15
non-intra blocks (macroblock_intra = 0)    B-14    B-14


7.2.2.2 First coefficient of a non-intra block 

In the case of the first coefficient of a non-intra block (a block in a non-intra macroblock) Table B-14 is
modified as indicated by "NOTE 2" and "NOTE 3" at the foot of that Table.

This modification only affects the entry that represents run = 0, level = ±1. Since it is not possible to
encode an End of block as the first coefficient of a block (the block would be "not coded" in this case) no
possibility for ambiguity exists.

The positions in the syntax that use this modified Table are denoted by "First DCT coefficient" in the
syntax specification of 6.2.6. The remainder of the coefficients are denoted by "Subsequent DCT
coefficients".

NOTE - In the case that Table B-14 is used for an intra block, the first coefficient shall be coded as
specified in 7.2.1. Table B-14 shall therefore not be modified as the first coefficient that uses
Table B-14 is the second coefficient in the block.

7.2.2.3 Escape coding

Many possible combinations of run and level have no variable length code to represent them. In order to
encode these statistically rare combinations an Escape coding method is used.

Table B-16 defines the escape coding method. The Escape VLC is followed by a 6-bit fixed length code
giving "run". This is followed by a 12-bit fixed length code giving the values of "signed_level".

NOTE - Attention is drawn to the fact that the escape coding method used in this specification is
different to that used in ISO/IEC 11172-2.

7.2.2.4 Summary

To summarise 7.2.2, the variable length decoding process shall be equivalent to the following. At the
start of this process n shall take the value zero for non-intra blocks and one for intra blocks.

eob_not_read = 1;
while ( eob_not_read )
{
    <decode VLC, decode Escape coded coefficient if required>
    if ( <decoded VLC indicates End of block> ) {
        eob_not_read = 0;
        while ( n < 64 ) {
            QFS[n] = 0;
            n = n + 1;
        }
    } else {
        for ( m = 0; m < run; m++ ) {
            QFS[n] = 0;
            n = n + 1;
        }
        QFS[n] = signed_level;
        n = n + 1;
    }
}

NOTE - eob_not_read and m are temporary variables that are not used elsewhere in this
specification.
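
The loop above can be exercised with a small C sketch that takes already decoded events instead of reading a bitstream. The struct vlc_event type, its field names and the error return are assumptions made for this illustration; they do not appear in the specification.

```c
#include <assert.h>

/* One decoded event: either End of block, or a (run, signed_level) pair.
 * This type is illustrative and is not taken from the specification. */
struct vlc_event {
    int end_of_block;   /* non-zero for End of block */
    int run;            /* number of zero coefficients before the value */
    int signed_level;   /* value of the coefficient that follows the run */
};

/* Fills QFS[n_start..63] from a list of events. n_start is 0 for non-intra
 * blocks and 1 for intra blocks (QFS[0] then comes from the intra DC process).
 * Returns 0 on success, or -1 if an event would write past coefficient 63
 * or the list ends without End of block. */
static int expand_run_level(const struct vlc_event *events, int count,
                            int n_start, int QFS[64])
{
    int n = n_start;
    int i, m;
    assert(n_start == 0 || n_start == 1);
    for (i = 0; i < count; i++) {
        if (events[i].end_of_block) {
            while (n < 64)
                QFS[n++] = 0;
            return 0;
        }
        if (n + events[i].run > 63)
            return -1;
        for (m = 0; m < events[i].run; m++)
            QFS[n++] = 0;
        QFS[n++] = events[i].signed_level;
    }
    return -1;
}
```

The bounds check is an addition of this sketch; the specification expresses the same constraint as a requirement on the bitstream rather than as decoder behaviour.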



7.3 Inverse scan

Let the data at the output of the variable length decoder be denoted by QFS[n]. n is in the range 0 to 63.

This clause specifies the way in which the one-dimensional data, QFS[n], is converted into a two-
dimensional array of coefficients denoted by QF[v][u]. u and v both lie in the range 0 to 7.

Two scan patterns are defined. The scan that shall be used shall be determined by alternate_scan which is
encoded in the picture coding extension.

Figure 7-2 defines scan[alternate_scan][v][u] for the case that alternate_scan is zero. Figure 7-3 defines
scan[alternate_scan][v][u] for the case that alternate_scan is one.

Figure 7-2. Definition of scan[0][v][u]

Figure 7-3. Definition of scan[1][v][u]



The inverse scan shall be any process equivalent to the following:

for (v=0; v<8; v++)
    for (u=0; u<8; u++)
        QF[v][u] = QFS[scan[alternate_scan][v][u]]

NOTE - The scan patterns defined here are often referred to as "zigzag scanning order".

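
The following C sketch illustrates the inverse scan for alternate_scan equal to zero. Because Figure 7-2 is not reproduced in this text, the sketch assumes that scan[0] is the conventional zigzag pattern and generates it by walking the anti-diagonals of the block; the alternate pattern of Figure 7-3 is not covered. Function names are hypothetical.

```c
#include <assert.h>

/* Builds the conventional zigzag order: scan[v][u] is the position n in the
 * one-dimensional sequence of the coefficient at row v, column u. */
static void build_zigzag_scan(int scan[8][8])
{
    int n = 0;
    int d, v;
    for (d = 0; d < 15; d++) {            /* d = u + v selects one anti-diagonal */
        int v_min = d > 7 ? d - 7 : 0;
        int v_max = d < 7 ? d : 7;
        if (d % 2 == 1) {                 /* odd diagonals are walked downwards */
            for (v = v_min; v <= v_max; v++)
                scan[v][d - v] = n++;
        } else {                          /* even diagonals are walked upwards */
            for (v = v_max; v >= v_min; v--)
                scan[v][d - v] = n++;
        }
    }
    assert(n == 64);
}

/* QF[v][u] = QFS[scan[v][u]] for every position in the block. */
static void inverse_scan(const int QFS[64], int scan[8][8], int QF[8][8])
{
    int v, u;
    for (v = 0; v < 8; v++)
        for (u = 0; u < 8; u++)
            QF[v][u] = QFS[scan[v][u]];
}
```

A decoder would normally store both scan patterns as constant tables; generating the pattern here only avoids transcribing 64 values by hand.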
7.3.1 Inverse scan for matrix download

When the quantisation matrices are downloaded they are encoded in the bitstream in a scan order that is
converted into the two-dimensional matrix used in the inverse quantiser in an identical manner to that
used for coefficients.

For matrix download the scan defined by Figure 7-2 (i.e. scan[0][v][u]) shall always be used.

Let W[w][v][u] denote the weighting matrix in the inverse quantiser (see 7.4.2.1), and W'[w][n] denote the
matrix as it is encoded in the bitstream. The matrix download shall then be equivalent to the following:

for (v=0; v<8; v++)
    for (u=0; u<8; u++)
        W[w][v][u] = W'[w][scan[0][v][u]]

7.4 Inverse quantisation

The two-dimensional array of coefficients, QF[v][u], is inverse quantised to produce the reconstructed
DCT coefficients. This process is essentially a multiplication by the quantiser step size. The quantiser
step size is modified by two mechanisms: a weighting matrix is used to modify the step size within a block
and a scale factor is used in order that the step size can be modified at the cost of only a few bits (as
compared to encoding an entire new weighting matrix).



[Figure: QF[v][u] enters the Inverse Quantisation Arithmetic block, which is controlled by
quant_scale_code and W[w][v][u] and outputs F''[v][u].]

Figure 7-4. Inverse quantisation process



Figure 7-4 illustrates the overall inverse quantisation process. After the appropriate inverse quantisation
arithmetic the resulting coefficients, F''[v][u], are saturated to yield F'[v][u] and then a mismatch control
operation is performed to give the final reconstructed DCT coefficients, F[v][u].

NOTE - Attention is drawn to the fact that the method of achieving mismatch control in this
specification is different to that employed by ISO/IEC 11172-2.

7.4.1 Intra DC coefficient

The DC coefficients of intra coded blocks shall be inverse quantised in a different manner to all other
coefficients.

In intra blocks F''[0][0] shall be obtained by multiplying QF[0][0] by a constant multiplier,
intra_dc_mult (constant in the sense that it is not modified by either the weighting matrix or the scale
factor). The multiplier is related to the parameter intra_dc_precision that is encoded in the picture coding
extension. Table 7-4 specifies the relation between intra_dc_precision and intra_dc_mult.



Table 7-4. Relation between intra_dc_precision and intra_dc_mult

intra_dc_precision    Bits of precision    intra_dc_mult
0                     8                    8
1                     9                    4
2                     10                   2
3                     11                   1



Thus: F''[0][0] = intra_dc_mult × QF[0][0]

7.4.2 Other coefficients

All coefficients other than the DC coefficient of an intra block shall be inverse quantised as specified in
this clause.

7.4.2.1 Weighting matrices

When 4:2:0 data is used two weighting matrices are used. One shall be used for intra macroblocks and
the other for non-intra macroblocks. When 4:2:2 or 4:4:4 data is used, four matrices are used allowing
different matrices to be used for luminance and chrominance data. Each matrix has a default set of values
which may be overwritten by down-loading a user defined matrix as explained in 6.2.3.2.

Let the weighting matrices be denoted by W[w][v][u] where w takes the values 0 to 3 indicating which of
the matrices is being used. Table 7-5 summarises the rules governing the selection of w.



Table 7-5. Selection of w

                                           4:2:0                       4:2:2 and 4:4:4
                                           luminance    chrominance    luminance    chrominance
                                           (cc = 0)     (cc ≠ 0)       (cc = 0)     (cc ≠ 0)
intra blocks (macroblock_intra = 1)        0            0              0            2
non-intra blocks (macroblock_intra = 0)    1            1              1            3


7.4.2.2 Quantiser scale factor 

The quantisation scale factor is encoded as a 5 bit fixed length code, quantiser_scale_code. This indicates
the appropriate quantiser_scale to apply in the inverse quantisation arithmetic.

q_scale_type (encoded in the picture coding extension) indicates which of two mappings between
quantiser_scale_code and quantiser_scale shall apply. Table 7-6 shows the two mappings between
quantiser_scale_code and quantiser_scale.

Table 7-6. Relation between quantiser_scale and quantiser_scale_code

                        quantiser_scale[q_scale_type]
quantiser_scale_code    q_scale_type = 0    q_scale_type = 1
0                       (forbidden)
1                       2                   1
2                       4                   2
3                       6                   3
4                       8                   4
5                       10                  5
6                       12                  6
7                       14                  7
8                       16                  8
9                       18                  10
10                      20                  12
11                      22                  14
12                      24                  16
13                      26                  18
14                      28                  20
15                      30                  22
16                      32                  24
17                      34                  28
18                      36                  32
19                      38                  36
20                      40                  40
21                      42                  44
22                      44                  48
23                      46                  52
24                      48                  56
25                      50                  64
26                      52                  72
27                      54                  80
28                      56                  88
29                      58                  96
30                      60                  104
31                      62                  112
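
The two mappings of Table 7-6 can be sketched in C as a doubling for q_scale_type equal to zero and a table lookup for q_scale_type equal to one. The function and array names are hypothetical, chosen for this illustration.

```c
#include <assert.h>

/* quantiser_scale for q_scale_type = 1, indexed by quantiser_scale_code.
 * Entry 0 is a placeholder because the code value zero is forbidden. */
static const int non_linear_quantiser_scale[32] = {
     0,  1,  2,  3,  4,  5,   6,   7,
     8, 10, 12, 14, 16, 18,  20,  22,
    24, 28, 32, 36, 40, 44,  48,  52,
    56, 64, 72, 80, 88, 96, 104, 112
};

/* Maps quantiser_scale_code (1 to 31) to quantiser_scale. */
static int quantiser_scale_from_code(int q_scale_type, int quantiser_scale_code)
{
    assert(quantiser_scale_code >= 1 && quantiser_scale_code <= 31);
    if (q_scale_type == 0)
        return 2 * quantiser_scale_code;    /* linear mapping */
    return non_linear_quantiser_scale[quantiser_scale_code];
}
```

The linear column is computed rather than stored because every entry for q_scale_type equal to zero is exactly twice the code value.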



7.4.2.3 Reconstruction formulae

The following equation specifies the arithmetic to reconstruct F''[v][u] from QF[v][u] (for all coefficients
except intra DC coefficients).

$$F''[v][u] = \bigl( (2 \times QF[v][u] + k) \times W[w][v][u] \times \mathit{quantiser\_scale} \bigr) / 32$$

where:

$$k = \begin{cases} 0 & \text{intra blocks} \\ \mathrm{Sign}(QF[v][u]) & \text{non-intra blocks} \end{cases}$$

NOTE - The above equation uses the "/" operator as defined in 4.1.
7.4.3 Saturation

The coefficients resulting from the Inverse Quantisation Arithmetic are saturated to lie in the range
[-2048:+2047]. Thus:

$$F'[v][u] = \begin{cases} 2047 & F''[v][u] > 2047 \\ F''[v][u] & -2048 \le F''[v][u] \le 2047 \\ -2048 & F''[v][u] < -2048 \end{cases}$$



7.4.4 Mismatch control 

Mismatch control shall be performed by any process equivalent to the following. Firstly all of the
reconstructed, saturated coefficients, F'[v][u], in the block shall be summed. This value is then tested to
determine whether it is odd or even. If the sum is even then a correction shall be made to just one
coefficient, F[7][7]. Thus:

$$\mathit{sum} = \sum_{v=0}^{v<8} \sum_{u=0}^{u<8} F'[v][u]$$

$$F[v][u] = F'[v][u] \quad \text{for all } u, v \text{ except } u = v = 7$$

$$F[7][7] = \begin{cases} F'[7][7] & \text{if sum is odd} \\ F'[7][7] - 1 & \text{if sum is even and } F'[7][7] \text{ is odd} \\ F'[7][7] + 1 & \text{if sum is even and } F'[7][7] \text{ is even} \end{cases}$$

NOTES -

1 It may be useful to note that the above correction for F[7][7] may simply be implemented by
toggling the least significant bit of the twos complement representation of the coefficient.
Also since only the "oddness" or "evenness" of the sum is of interest an exclusive OR (of just
the least significant bit) may be used to calculate "sum".

2 Warning. Small non-zero inputs to the IDCT may result in zero output for compliant
IDCTs. If this occurs in an encoder, mismatch may occur in some pictures in a decoder that
uses a different compliant IDCT. An encoder should avoid this problem and may do so by
checking the output of its own IDCT. It should ensure that it never inserts any non-zero
coefficients into the bitstream when the block in question reconstructs to zero through its
own IDCT function. If this action is not taken by the encoder, situations can arise where
large and very visible mismatches between the state of the encoder and decoder occur.

7.4.5 Summary 

In summary the inverse quantisation process is any process numerically equivalent to:

for (v=0; v<8; v++) {
    for (u=0; u<8; u++) {
        if ( (u==0) && (v==0) && (macroblock_intra) ) {
            F''[v][u] = intra_dc_mult * QF[v][u];
        } else {
            if ( macroblock_intra ) {
                F''[v][u] = ( QF[v][u] * W[w][v][u] * quantiser_scale * 2 ) / 32;
            } else {
                F''[v][u] = ( ( ( QF[v][u] * 2 ) + Sign(QF[v][u]) ) * W[w][v][u]
                            * quantiser_scale ) / 32;
            }
        }
    }
}

sum = 0;
for (v=0; v<8; v++) {
    for (u=0; u<8; u++) {
        if ( F''[v][u] > 2047 ) {
            F'[v][u] = 2047;
        } else {
            if ( F''[v][u] < -2048 ) {
                F'[v][u] = -2048;
            } else {
                F'[v][u] = F''[v][u];
            }
        }
        sum = sum + F'[v][u];
        F[v][u] = F'[v][u];
    }
}

if ((sum & 1) == 0) {
    if ((F[7][7] & 1) != 0) {
        F[7][7] = F'[7][7] - 1;
    } else {
        F[7][7] = F'[7][7] + 1;
    }
}
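
A compact C sketch of the same process for a single block is given below. It is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, W is the weighting matrix already selected by w, the intermediate arrays F'' and F' are folded into a local value, and the mismatch correction toggles the least significant bit as suggested in NOTE 1 of 7.4.4. The sketch relies on C99 integer division truncating toward zero and on a twos complement representation of int.

```c
#include <assert.h>

/* Returns -1, 0 or +1 according to the sign of x. */
static int sign_of(int x)
{
    return (x > 0) - (x < 0);
}

/* Saturates a coefficient to the range [-2048:+2047]. */
static int saturate_coefficient(int x)
{
    if (x > 2047) return 2047;
    if (x < -2048) return -2048;
    return x;
}

/* Inverse quantises one 8x8 block QF into F, including saturation and
 * mismatch control. W is the weighting matrix already selected by w. */
static void inverse_quantise_block(int QF[8][8], int W[8][8],
                                   int quantiser_scale, int intra_dc_mult,
                                   int macroblock_intra, int F[8][8])
{
    int v, u, sum = 0;
    assert(quantiser_scale >= 1 && quantiser_scale <= 112);
    for (v = 0; v < 8; v++) {
        for (u = 0; u < 8; u++) {
            int value;
            if (u == 0 && v == 0 && macroblock_intra) {
                value = intra_dc_mult * QF[v][u];
            } else {
                int k = macroblock_intra ? 0 : sign_of(QF[v][u]);
                /* integer division truncates toward zero in C99 */
                value = ((2 * QF[v][u] + k) * W[v][u] * quantiser_scale) / 32;
            }
            F[v][u] = saturate_coefficient(value);
            sum += F[v][u];
        }
    }
    if ((sum & 1) == 0)
        F[7][7] ^= 1;   /* odd value decreases by one, even value increases by one */
}
```

For example, a non-intra block whose only non-zero input is QF[0][0] = 1, with W[0][0] = 16 and quantiser_scale = 2, gives (2 + 1) × 16 × 2 / 32 = 3; the sum is odd, so F[7][7] is left at zero.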



7.5 Inverse DCT 

Once the DCT coefficients, F[v][u], are reconstructed, the inverse DCT transform defined in Annex A
shall be applied to obtain the inverse transformed values, f[y][x]. These values shall be saturated so that:
-256 ≤ f[y][x] ≤ 255, for all x, y.



Recommendation ITU-T H.262 (1995 E)

© ISO/IEC

ISO/IEC 13818-2: 1995 (E)



7.5.1 Non-coded blocks and skipped macroblocks

In a macroblock that is not skipped, if pattern_code[i] is one for a given block in the macroblock then
coefficient data is included in the bitstream for that block. This is decoded as specified in the
preceding clauses.

However, if pattern_code[i] is zero, or if the macroblock is skipped, then that block contains no coefficient
data. The sample domain coefficients f[y][x] for such a block shall all take the value zero.

7.6 Motion compensation 

The motion compensation process forms predictions from previously decoded pictures which are 
combined with the coefficient data (from the output of the IDCT) in order to recover the final decoded 
samples. Figure 7-5 shows a simplified diagram of this process. 

In general up to four separate predictions are formed for each block which are combined together to form
the final prediction block p[y][x].

In the case of intra coded macroblocks no prediction is formed so that p[y][x] will be zero. The saturation 
shown in Figure 7-5 is still required in order to remove negative values from f[y][x]. Intra coded 
macroblocks may carry motion vectors known as "concealment motion vectors". Despite this no 
prediction is formed in the normal course of events. This motion vector information is intended for use in 
the case that bitstream errors preclude the decoding of coefficient information. The way in which a 
decoder shall use this information is not specified. The only requirement for these motion vectors is that 
they shall have the correct syntax for motion vectors. A description of the way in which these motion 
vectors may be used can be found in 7.6.3.9. 
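
The combination of prediction and coefficient data can be sketched in C as below. The sketch assumes 8-bit samples, so the saturation of Figure 7-5 is taken to clamp to the range 0 to 255; that range, the function names and the two flags are assumptions of this illustration. The coded flag covers the case of a block that carries no coefficient data, for which f[y][x] is zero.

```c
#include <assert.h>

/* Clamps a value to the range of an 8-bit sample (an assumption of this sketch). */
static int saturate_sample(int x)
{
    if (x < 0) return 0;
    if (x > 255) return 255;
    return x;
}

/* Combines the prediction p and the IDCT output f of one 8x8 block into d.
 * intra: non-zero for an intra macroblock, in which case p is taken as zero.
 * coded: zero for a block without coefficient data, in which case f is taken as zero. */
static void reconstruct_block(int p[8][8], int f[8][8],
                              int intra, int coded, int d[8][8])
{
    int y, x;
    for (y = 0; y < 8; y++) {
        for (x = 0; x < 8; x++) {
            int prediction = intra ? 0 : p[y][x];
            int residual = 0;
            if (coded) {
                /* the IDCT output is already saturated to [-256:255], see 7.5 */
                assert(f[y][x] >= -256 && f[y][x] <= 255);
                residual = f[y][x];
            }
            d[y][x] = saturate_sample(prediction + residual);
        }
    }
}
```

For an intra block the saturation removes negative values of f[y][x]; for a block without coefficient data the output is simply the prediction.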

In the case where a block is not coded, either because the entire macroblock is skipped or the specific
block is not coded, there is no coefficient data. In this case f[y][x] is zero and the decoded samples are
simply the prediction, p[y][x].

Figure 7-5. Simplified motion compensation process

7.6.1 Prediction modes

There are two major classifications of the prediction mode: field prediction and frame prediction. 

In field prediction, predictions are made independently for each field by using data from one or more 
previously decoded fields. Frame prediction forms a prediction for the frame from one or more previously 
decoded frames. It must be understood that the fields and frames from which predictions are made may 
themselves have been decoded as either field pictures or frame pictures. 

Within a field picture all predictions are field predictions. However, in a frame picture either field 
predictions or frame predictions may be used (selected on a macroblock-by-macroblock basis). 

In addition to the major classification of field or frame prediction two special prediction modes are used: 






• 16x8 motion compensation, in which two motion vectors are used for each macroblock. The 
first motion vector is used for the upper 16x8 region, the second for the lower 16x8 region. In 
the case of a bidirectionally predicted macroblock a total of four motion vectors will be used since 
there will be two for the forward prediction and two for the backward prediction. In this 
specification 16x8 motion compensation shall only be used with field pictures. 

• Dual-prime, in which only one motion vector is encoded (in its full format) in the bitstream 
together with a small differential motion vector. In the case of field pictures two motion vectors 
are then derived from this information. These are used to form predictions from two reference 
fields (one top, one bottom) which are averaged to form the final prediction. In the case of frame 
pictures this process is repeated for the two fields so that a total of four field predictions are 
made. This mode shall only be used in P-pictures where there are no B-pictures between the 
predicted and reference fields or frames. 



7.6.2 Prediction field and frame selection 

The selection of which fields and frames shall be used to form predictions shall be made as detailed in this 
clause. 

7.6.2.1 Field prediction 

In P-pictures, the two reference fields from which predictions shall be made are the most recently decoded 
reference top field and the most recently decoded reference bottom field. The simplest case illustrated in 
Figure 7-6 shall be used when predicting the first picture of a coded frame or when using field prediction 
within a frame-picture. In these cases the two reference fields are part of the same reconstructed frame. 

NOTES- 

1 The reference fields may themselves have been reconstructed from two field-pictures or a 
single frame-picture. 

2 In the case of predicting a field picture, the field being predicted may be either the top field 
or the bottom field. 




Figure 7-6. Prediction of the first field or field prediction in a frame-picture 
(diagram labels: possible intervening B-pictures, not yet decoded) 

The case when predicting the second field picture of a coded frame is more complicated because the two 
most recently decoded reference fields shall be used, and in this case, the most recent reference field was 
obtained from decoding the first field picture of the coded frame. Figure 7-7 illustrates the situation when 
this second picture is the bottom field. Figure 7-8 illustrates the situation when this second picture is the 
top field. 

NOTE - The earlier reference field may itself have been reconstructed by decoding a field picture or a 
frame picture. 




Figure 7-7. Prediction of the second field-picture when it is the bottom field 
(diagram labels: possible intervening B-pictures, not yet decoded) 

Figure 7-8. Prediction of the second field-picture when it is the top field 

Field prediction in B-pictures shall be made from the two fields of the two most recently reconstructed 
reference frames. Figure 7-9 illustrates this situation. 



NOTE - The reference frames may themselves have been reconstructed from two coded field-pictures 
or a single coded frame-picture. 

Figure 7-9. Field-prediction of B field pictures or B frame pictures 
(diagram labels: top reference field and bottom reference field of each reference frame; possible 
intervening B-pictures, already decoded; possible intervening B-pictures, not yet decoded) 



7.6.2.2 Frame prediction 

In P-pictures prediction shall be made from the most recently reconstructed reference frame. This is 
illustrated in Figure 7-10. 

NOTE - The reference frame may itself have been reconstructed from two field pictures or a single 
frame picture. 




Figure 7-10. Frame-prediction for I-pictures and P-pictures 
(diagram labels: possible intervening B-pictures, not yet decoded) 

Similarly frame prediction in B-pictures shall be made from the two most recently reconstructed reference 
frames as illustrated in Figure 7-11. 

NOTE - The reference frames themselves may each have been reconstructed from two field pictures 
or a single frame picture. 



Figure 7-11. Frame-prediction for B-pictures 
(diagram labels: reference frames; possible intervening B-pictures, already decoded; possible 
intervening B-pictures, not yet decoded) 

7.6.3 Motion vectors 

Motion vectors are coded differentially with respect to previously decoded motion vectors in order to 
reduce the number of bits required to represent them. In order to decode the motion vectors the decoder 
shall maintain four motion vector predictors (each with a horizontal and vertical component) denoted 
PMV[r][s][t]. For each prediction, a motion vector, vector'[r][s][t], is first derived. This is then scaled 
depending on the sampling structure (4:2:0, 4:2:2 or 4:4:4) to give a motion vector, vector[r][s][t], for 
each colour component. The meanings associated with the dimensions in this array are defined in 
Table 7-7. 



Table 7-7. Meaning of indices in PMV[r][s][t], vector[r][s][t] and vector'[r][s][t] 

Index   Value 0                             Value 1 
r       First motion vector in macroblock   Second motion vector in macroblock 
s       Forward motion vector               Backward motion vector 
t       Horizontal component                Vertical component 

NOTE - r also takes the values 2 and 3 for derived motion vectors used with dual-prime 
prediction. Since these motion vectors are derived they do not themselves have motion vector 
predictors. 



7.6.3.1 Decoding the motion vectors 

Each motion vector component, vector'[r][s][t], shall be calculated by any process that is equivalent to the 
following one. Note that the motion vector predictors shall also be updated by this process. 

    r_size = f_code[s][t] - 1 
    f = 1 << r_size 
    high = ( 16 * f ) - 1; 
    low = ( (-16) * f ); 
    range = ( 32 * f ); 

    if ( (f == 1) || (motion_code[r][s][t] == 0) ) 
        delta = motion_code[r][s][t]; 
    else { 
        delta = ( ( Abs(motion_code[r][s][t]) - 1 ) * f ) + motion_residual[r][s][t] + 1; 
        if (motion_code[r][s][t] < 0) 
            delta = - delta; 
    } 

    prediction = PMV[r][s][t]; 
    if ( (mv_format == "field") && (t == 1) && (picture_structure == "Frame picture") ) 
        prediction = PMV[r][s][t] DIV 2; 

    vector'[r][s][t] = prediction + delta; 
    if (vector'[r][s][t] < low) 
        vector'[r][s][t] = vector'[r][s][t] + range; 
    if (vector'[r][s][t] > high) 
        vector'[r][s][t] = vector'[r][s][t] - range; 

    if ( (mv_format == "field") && (t == 1) && (picture_structure == "Frame picture") ) 
        PMV[r][s][t] = vector'[r][s][t] * 2; 
    else 
        PMV[r][s][t] = vector'[r][s][t]; 



The parameters in the bitstream shall be such that the reconstructed differential motion vector, delta, shall 
lie in the range [low : high]. In addition the reconstructed motion vector, vector'[r][s][t], and the updated 
value of the motion vector predictor PMV[r][s][t], shall also lie in the range [low : high]. 

r_size, f, delta, high, low and range are temporary variables that are not used in the remainder of this 
specification. 

motion_code[r][s][t] and motion_residual[r][s][t] are fields recovered from the bitstream. mv_format is 
recovered from the bitstream using Table 6-17 and Table 6-18. 

r, s and t specify the particular motion vector component being processed as identified in Table 7-7. 
vector'[r][s][t] is the final reconstructed motion vector for the luminance component of the macroblock. 
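
The decoding process of 7.6.3.1 can be sketched for a single component as follows. This is an illustrative sketch only: the function name and the flag field_vertical_in_frame (standing for the condition mv_format == "field", t == 1 and a frame picture) are hypothetical, and DIV is assumed to be integer division truncating towards minus infinity, which is what Python's // operator does.

```python
def decode_mv_component(pmv, f_code, motion_code, motion_residual=0,
                        field_vertical_in_frame=False):
    """Return (vector, updated_pmv) for one motion vector component.

    pmv is the current predictor PMV[r][s][t]. field_vertical_in_frame is a
    hypothetical flag for the vertical component of a field vector in a
    frame picture, where the predictor is halved and then doubled again.
    """
    r_size = f_code - 1
    f = 1 << r_size
    high = 16 * f - 1
    low = -16 * f
    rng = 32 * f

    if f == 1 or motion_code == 0:
        delta = motion_code
    else:
        delta = (abs(motion_code) - 1) * f + motion_residual + 1
        if motion_code < 0:
            delta = -delta

    # DIV assumed to truncate towards minus infinity (Python floor division).
    prediction = pmv // 2 if field_vertical_in_frame else pmv

    vector = prediction + delta
    if vector < low:
        vector += rng
    if vector > high:
        vector -= rng

    new_pmv = vector * 2 if field_vertical_in_frame else vector
    return vector, new_pmv
```

The wrap-around by range is what allows a small delta to reach any value in [low : high] from any predictor value.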

7.6.3.2 Motion vector restrictions 

In frame pictures, the vertical component of field motion vectors shall be restricted so that they only cover 
half the range that is supported by the f_code that relates to those motion vectors. This restriction ensures 
that the motion vector predictors will always have values that are appropriate for decoding subsequent 
frame motion vectors. Table 7-8 summarises the size of motion vectors that may be coded as a function of 
f_code. 

Table 7-8. Allowable motion vector range as a function of f_code[s][t] 

f_code[s][t]   Vertical components (t==1) of       All other cases 
               field vectors in frame pictures 
0              (forbidden) 
1              [-4: +3.5]                          [-8: +7.5] 
2              [-8: +7.5]                          [-16: +15.5] 
3              [-16: +15.5]                        [-32: +31.5] 
4              [-32: +31.5]                        [-64: +63.5] 
5              [-64: +63.5]                        [-128: +127.5] 
6              [-128: +127.5]                      [-256: +255.5] 
7              [-256: +255.5]                      [-512: +511.5] 
8              [-512: +511.5]                      [-1024: +1023.5] 
9              [-1024: +1023.5]                    [-2048: +2047.5] 
10-14          (reserved) 
15             (used when a particular f_code[s][t] will not be used) 
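
The ranges in Table 7-8 follow from the limits low = -16 * f and high = (16 * f) - 1 of 7.6.3.1, expressed in half-sample units. The sketch below reproduces the table entries in sample units; the function name and its boolean parameter are hypothetical.

```python
def mv_range_in_samples(f_code, field_vertical_in_frame=False):
    """Return (lowest, highest) codable motion vector value in samples.

    Only f_code values 1 to 9 carry a range in Table 7-8.
    """
    if not 1 <= f_code <= 9:
        raise ValueError("f_code must be in 1..9 to define a range")
    f = 1 << (f_code - 1)
    low, high = -16 * f, 16 * f - 1          # half-sample units
    if field_vertical_in_frame:
        low, high = -8 * f, 8 * f - 1        # half of the full range
    return low / 2, high / 2
```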



7.6.3.3 Updating motion vector predictors 

Once all of the motion vectors present in the macroblock have been decoded using the process defined in 
the previous clause it is sometimes necessary to update other motion vector predictors. This is because in 
some prediction modes fewer than the maximum possible number of motion vectors are used. The 
remainder of the predictors that might be used in the picture must retain "sensible" values in case they are 
subsequently used. 

The motion vector predictors shall be updated as specified in Table 7-9 and Table 7-10. The rules for 
updating motion vector predictors in the case of skipped macroblocks are specified in 7.6.6. 

NOTE - It is possible for an implementation to optimise the updating (and resetting) of motion vector 
predictors depending on the picture type. For example in a P-picture the predictors for 
backwards motion vectors are unused and need not be maintained. 

Table 7-9. Updating of motion vector predictors in frame pictures 

                 macroblock_   macroblock_ 
frame_motion_    motion_       motion_       macroblock_ 
type             forward       backward      intra         Predictors to update 
Frame-based‡     -             -             1             PMV[1][0][1:0] = PMV[0][0][1:0] ◊ 
Frame-based      1             1             0             PMV[1][0][1:0] = PMV[0][0][1:0] 
                                                           PMV[1][1][1:0] = PMV[0][1][1:0] 
Frame-based      1             0             0             PMV[1][0][1:0] = PMV[0][0][1:0] 
Frame-based      0             1             0             PMV[1][1][1:0] = PMV[0][1][1:0] 
Frame-based‡     0             0             0             PMV[r][s][t] = 0 § 
Field-based      1             1             0             (none) 
Field-based      1             0             0             (none) 
Field-based      0             1             0             (none) 
Dual prime       1             0             0             PMV[1][0][1:0] = PMV[0][0][1:0] 

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that: 
PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0] 

◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t). 

‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based. 

§ (Only occurs in P-picture) PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4. 

Table 7-10. Updating of motion vector predictors in field pictures 

                 macroblock_   macroblock_ 
field_motion_    motion_       motion_       macroblock_ 
type             forward       backward      intra         Predictors to update 
Field-based‡     -             -             1             PMV[1][0][1:0] = PMV[0][0][1:0] ◊ 
Field-based      1             1             0             PMV[1][0][1:0] = PMV[0][0][1:0] 
                                                           PMV[1][1][1:0] = PMV[0][1][1:0] 
Field-based      1             0             0             PMV[1][0][1:0] = PMV[0][0][1:0] 
Field-based      0             1             0             PMV[1][1][1:0] = PMV[0][1][1:0] 
Field-based‡     0             0             0             PMV[r][s][t] = 0 § 
16x8 MC          1             1             0             (none) 
16x8 MC          1             0             0             (none) 
16x8 MC          0             1             0             (none) 
Dual prime       1             0             0             PMV[1][0][1:0] = PMV[0][0][1:0] 

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that: 
PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0] 

◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t). 

‡ field_motion_type is not present in the bitstream but is assumed to be Field-based. 

§ (Only occurs in P-picture) PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4. 
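
The update rules of Table 7-9 and Table 7-10 can be sketched together as follows. This is a non-normative illustration: the function name, the parameter names and the representation of PMV as a nested list pmv[r][s][t] are all assumptions made for the example.

```python
def update_predictors(pmv, picture_is_frame, motion_type, forward, backward,
                      intra, concealment_motion_vectors=False):
    """Update pmv[r][s][t] in place after the vectors of a macroblock are decoded."""
    def reset():
        for r in range(2):
            for s in range(2):
                pmv[r][s] = [0, 0]

    if intra:
        if concealment_motion_vectors:
            pmv[1][0] = list(pmv[0][0])
        else:
            reset()
        return
    if not forward and not backward:
        reset()  # only occurs in P-pictures, see 7.6.3.4
        return

    # Modes that code a single vector per direction: the second predictor
    # is copied from the first so that it keeps a sensible value.
    same_type = "Frame-based" if picture_is_frame else "Field-based"
    if motion_type not in (same_type, "Dual prime"):
        return  # field-based in frame pictures, 16x8 MC: nothing to update
    if forward:
        pmv[1][0] = list(pmv[0][0])
    if backward:
        pmv[1][1] = list(pmv[0][1])
```

Modes that already code two vectors per direction (field-based prediction in frame pictures and 16x8 MC in field pictures) leave every predictor up to date, which is why their table entries read "(none)".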





7.6.3.4 Resetting motion vector predictors 

All motion vector predictors shall be reset to zero in the following cases: 

• At the start of each slice. 

• Whenever an intra macroblock is decoded which has no concealment motion vectors. 

• In a P-picture when a non-intra macroblock is decoded in which macroblock_motion_forward is 
zero. 

• In a P-picture when a macroblock is skipped. 
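
The four reset conditions listed above can be collected into a single predicate, sketched below. The function and its keyword arguments are hypothetical names chosen for the illustration; only the conditions themselves come from the list.

```python
def must_reset_predictors(start_of_slice=False, picture_coding_type="P",
                          skipped=False, macroblock_intra=False,
                          concealment_motion_vectors=False,
                          macroblock_motion_forward=False):
    """Return True when all motion vector predictors shall be reset to zero."""
    if start_of_slice:
        return True
    if skipped:
        return picture_coding_type == "P"
    if macroblock_intra:
        return not concealment_motion_vectors
    # Non-intra macroblock: reset only in P-pictures without a forward vector.
    return picture_coding_type == "P" and not macroblock_motion_forward
```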

7.6.3.5 Prediction in P-pictures 

In P-pictures, in the case that macroblock_motion_forward is zero and macroblock_intra is also zero, no 
motion vectors are encoded for the macroblock yet a prediction must be formed. If this occurs in a P field 
picture the following apply: 

• The prediction type shall be "Field-based" 

• The (field) motion vector shall be zero (0; 0) 

• The motion vector predictors shall be reset to zero 

• Predictions shall be made from the field of the same parity as the field being predicted 

If this occurs in a P frame picture the following apply: 

• The prediction type shall be "Frame-based" 

• The (frame) motion vector shall be zero (0; 0) 

• The motion vector predictors shall be reset to zero 

In the case that a P field picture is used as the second field of a frame in which the first field is an I field 
picture a series of semantic restrictions apply. These ensure that prediction is only made from the I field 
picture. These restrictions are: 

• There shall be no macroblocks that are coded with macroblock_motion_forward zero and 
macroblock_intra zero. 

• Dual prime prediction shall not be used. 

• Field prediction in which motion_vertical_field_select indicates the same parity as the field 
being predicted shall not be used. 

• There shall be no skipped macroblocks. 

7.6.3.6 Dual prime additional arithmetic 

In dual prime prediction one field motion vector (vector'[0][0][1:0]) will have been decoded by the 
process already described. This represents the motion vector used to form predictions from the reference 
field (or reference fields in a frame picture) of the same parity as the prediction being formed. Here the 
word "parity" is used to differentiate the two fields. The top field has parity zero, the bottom field has 
parity one. 



Figure 7-12. Scaling of motion vectors for dual prime prediction 
(diagram labels: top and bottom fields of the reference picture; top and bottom fields of the picture 
being predicted; field vector from bitstream; derived vectors) 

In order to form a motion vector for the opposite parity (vector'[r][0][1:0]) the existing motion vector is 
scaled to reflect the different temporal distance between the fields. A correction is made to the vertical 
component (to reflect the vertical shift between the lines of top field and bottom field) and then a small 
differential motion vector is added. This process is illustrated in Figure 7-12 which shows the situation 
for a frame picture. 

dmvector[0] is the horizontal component of the differential motion vector and dmvector[1] the vertical 
component. The two components of the differential motion vector shall be decoded directly using 
Table B-11 and shall take only one of the values -1, 0, +1. 

m[parity_ref][parity_pred] is the field distance between the predicted field and the reference field as 
defined in Table 7-11. "parity_ref" is the parity of the reference field for which the new motion vector is 
being computed. "parity_pred" is the parity of the field that shall be predicted. 

e[parity_ref][parity_pred] is the adjustment necessary to reflect the vertical shift between the lines of top 
field and bottom field as defined in Table 7-12. 



Table 7-11. Definition of m[parity_ref][parity_pred] 

                                          m[parity_ref][parity_pred] 
picture_structure    top_field_first      m[1][0]      m[0][1] 
11 (Frame)           1                    1            3 
11 (Frame)           0                    3            1 
01 (Top Field)       -                    1            - 
10 (Bottom Field)    -                    -            1 



Table 7-12. Definition of e[parity_ref][parity_pred] 

parity_ref    parity_pred    e[parity_ref][parity_pred] 
0             1              +1 
1             0              -1 



The motion vector (or motion vectors) used for predictions of opposite parity shall be computed as follows: 

    vector'[r][0][0] = ((vector'[0][0][0] * m[parity_ref][parity_pred])//2) + dmvector[0]; 
    vector'[r][0][1] = ((vector'[0][0][1] * m[parity_ref][parity_pred])//2) 
                       + e[parity_ref][parity_pred] + dmvector[1]; 

In the case of field pictures only one such motion vector is required and here r=2. Thus the (encoded) 
motion vector used for the same parity prediction is vector'[0][0][1:0] and the motion vector used for the 
opposite parity prediction is vector'[2][0][1:0]. 

In the case of frame pictures two such motion vectors are required. Both fields use the encoded motion 
vector (vector'[0][0][1:0]) for predictions of the same parity. The top field shall use vector'[2][0][1:0] for 
opposite parity prediction and the bottom field shall use vector'[3][0][1:0] for opposite parity prediction. 
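
The dual prime arithmetic can be sketched as below. The sketch is illustrative only: the function names are hypothetical, m is passed in by the caller after lookup in Table 7-11, and the // operator of the specification is assumed to mean integer division with rounding to the nearest integer, half values rounded away from zero (this differs from Python's own // operator, so a helper is used).

```python
# Vertical adjustment e[parity_ref][parity_pred] from Table 7-12.
E_ADJUST = {(0, 1): +1, (1, 0): -1}


def div_round(a, b):
    """Integer division rounding to nearest, halves away from zero (b > 0).

    Assumed meaning of the specification's // operator.
    """
    q = (2 * abs(a) + b) // (2 * b)
    return q if a >= 0 else -q


def derive_opposite_parity_vector(vector, dmvector, m, parity_ref, parity_pred):
    """Derive vector'[r][0][1:0] (r = 2 or 3) from vector'[0][0][1:0].

    vector and dmvector are [horizontal, vertical] pairs.
    """
    e = E_ADJUST[(parity_ref, parity_pred)]
    return [
        div_round(vector[0] * m, 2) + dmvector[0],
        div_round(vector[1] * m, 2) + e + dmvector[1],
    ]
```

For a frame picture with top_field_first equal to 1, the top field (parity_pred = 0) would use m[1][0] = 1 and the bottom field (parity_pred = 1) would use m[0][1] = 3.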

7.6.3.7 Motion vectors for chrominance components 

The motion vectors calculated in the previous clauses refer to the luminance component where: 

    vector[r][s][t] = vector'[r][s][t] (for all r, s and t) 

For each of the two chrominance components the motion vectors shall be scaled as follows: 

4:2:0 Both the horizontal and vertical components of the motion vector are scaled by dividing by two: 

    vector[r][s][0] = vector'[r][s][0] / 2; 
    vector[r][s][1] = vector'[r][s][1] / 2; 

4:2:2 The horizontal component of the motion vector is scaled by dividing by two, the vertical 
component is not altered: 

    vector[r][s][0] = vector'[r][s][0] / 2; 
    vector[r][s][1] = vector'[r][s][1]; 

4:4:4 The motion vector is unmodified: 

    vector[r][s][0] = vector'[r][s][0]; 
    vector[r][s][1] = vector'[r][s][1]; 
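
A minimal sketch of the chrominance scaling follows. The function names are hypothetical, and the / operator is assumed to be integer division with truncation towards zero, so a helper is used instead of Python's floor division.

```python
def div_trunc(a, b):
    """Integer division truncating towards zero (assumed meaning of '/')."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def chroma_vector(luma_vector, chroma_format):
    """Scale a luminance vector [horizontal, vertical] for the chrominance components."""
    h, v = luma_vector
    if chroma_format == "4:2:0":
        return [div_trunc(h, 2), div_trunc(v, 2)]
    if chroma_format == "4:2:2":
        return [div_trunc(h, 2), v]
    if chroma_format == "4:4:4":
        return [h, v]
    raise ValueError("unknown chroma format: " + chroma_format)
```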

7.6.3.8 Semantic restrictions concerning predictions 

It is a requirement on the bitstream that it shall only demand of a decoder that predictions shall be made 
from slices actually encoded in a reference frame or reference field. This rule applies even for skipped 
macroblocks and macroblocks in P-pictures in which a zero motion vector is assumed (as explained in 
7.6.3.5). 

NOTE - As explained in 6.1.2 it is, in general, not necessary for the slices to cover the entire picture. 
However in many defined levels of defined profiles the "restricted slice structure" is used, in 
which case the slices do cover the entire picture. In this case the semantic rule may be more 
simply stated: "it is a restriction on the bitstream that reconstructed motion vectors shall not 
refer to samples outside the boundary of the coded picture." 

7.6.3.9 Concealment motion vectors 

Concealment motion vectors are motion vectors that may be carried by intra macroblocks for the purpose 
of concealing errors if data errors preclude decoding the coefficient data. A concealment motion vector 
shall be present for all intra macroblocks if (and only if) concealment_motion_vectors (in the 
picture_coding_extension()) has the value one. 

In the normal course of events no prediction shall be formed for such macroblocks (as would be expected 
since macroblock_intra = 1). This specification does not specify how error recovery shall be performed. 
However it is a recommendation that concealment motion vectors are suitable for use by a decoder that 
performs concealment by forming predictions as if field_motion_type and frame_motion_type (from 
which the prediction type is derived) have the following values: 

• In a field picture: field_motion_type = "Field-based" 

• In a frame picture: frame_motion_type = "Frame-based" 

NOTE - If concealment is used in an I-picture then the decoder should perform prediction in a similar 
way to a P-picture. 

Concealment motion vectors are intended for use in the case that a data error results in information being 
lost. There is therefore little point in encoding the concealment motion vector in the macroblock for 
which it is intended to be used since if the data error results in the need for error recovery it is very likely 
that the concealment motion vector itself would be lost or corrupted. As a result the following semantic 
rules are appropriate. 

• For all macroblocks except those in the bottom row of macroblocks concealment motion vectors 
should be appropriate for use in the macroblock that lies vertically below the macroblock in 
which the motion vector occurs. 

• When the motion vector is used with respect to the macroblock identified in the previous rule a 
decoder must assume that the motion vector may refer to samples outside of the slices encoded in 
the reference frame or reference field. 

• For all macroblocks in the bottom row of macroblocks the reconstructed concealment motion 
vectors will not be used. Therefore the motion vector (0; 0) may be used to reduce unnecessary 
overhead. 

7.6.4 Forming predictions 

Predictions are formed by reading prediction samples from the reference fields or frames. A given sample 
is predicted by reading the corresponding sample in the reference field or frame offset by the motion 
vector. 

A positive value of the horizontal component of a motion vector indicates that the prediction is made from 
samples (in the reference field/frame) that lie to the right of the samples being predicted. 

A positive value of the vertical component of a motion vector indicates that the prediction is made from 
samples (in the reference field/frame) that lie below the samples being predicted. 

All motion vectors are specified to an accuracy of one half sample. Thus if a component of the motion 
vector is odd, the samples will be read from mid-way between the actual samples in the reference 
field/frame. These half-samples are calculated by simple linear interpolation from the actual samples. 

In the case of field-based predictions it is necessary to determine which of the two available fields to use to 
form the prediction. In the case of dual-prime this is specified in that a motion vector is derived for both 
of the fields and a prediction is formed from each. In the case of field-based prediction and 16x8 MC an 
additional bit, motion_vertical_field_select, is encoded to indicate which field to use. 

If motion_vertical_field_select is zero then the prediction is taken from the top reference field. 

If motion_vertical_field_select is one then the prediction is taken from the bottom reference field. 

For each prediction block the integer sample motion vectors int_vec[t] and the half sample flags 
half_flag[t] shall be formed as follows: 

    for (t=0; t<2; t++) { 
        int_vec[t] = vector[r][s][t] DIV 2; 
        if ((vector[r][s][t] - (2 * int_vec[t])) != 0) 
            half_flag[t] = 1; 
        else 
            half_flag[t] = 0; 
    } 

Then for each sample in the prediction block the samples are read and the half sample prediction applied 
as follows: 

    if ( (! half_flag[0]) && (! half_flag[1]) ) 
        pel_pred[y][x] = pel_ref[y + int_vec[1]][x + int_vec[0]]; 

    if ( (! half_flag[0]) && half_flag[1] ) 
        pel_pred[y][x] = ( pel_ref[y + int_vec[1]][x + int_vec[0]] + 
                           pel_ref[y + int_vec[1]+1][x + int_vec[0]] ) // 2; 

    if ( half_flag[0] && (! half_flag[1]) ) 
        pel_pred[y][x] = ( pel_ref[y + int_vec[1]][x + int_vec[0]] + 
                           pel_ref[y + int_vec[1]][x + int_vec[0]+1] ) // 2; 

    if ( half_flag[0] && half_flag[1] ) 
        pel_pred[y][x] = ( pel_ref[y + int_vec[1]][x + int_vec[0]] + 
                           pel_ref[y + int_vec[1]][x + int_vec[0]+1] + 
                           pel_ref[y + int_vec[1]+1][x + int_vec[0]] + 
                           pel_ref[y + int_vec[1]+1][x + int_vec[0]+1] ) // 4; 

where pel_pred[y][x] is the prediction sample being formed and pel_ref[y][x] are samples in the reference 
field or frame. 
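
The two steps above (splitting the vector into an integer part and a half-sample flag, then interpolating) can be sketched for a single sample as follows. The function name is hypothetical; DIV is assumed to truncate towards minus infinity and // to round to the nearest integer with halves rounded up, which for the non-negative sums used here equals adding half the divisor before shifting. No bounds checking is done, so the caller must keep all reads inside pel_ref.

```python
def predict_sample(pel_ref, y, x, vector):
    """Form one prediction sample from a half-sample vector [horizontal, vertical].

    pel_ref is a list of rows of non-negative integer samples.
    """
    int_vec = [v // 2 for v in vector]                       # DIV (floor)
    half_flag = [v - 2 * i != 0 for v, i in zip(vector, int_vec)]
    row, col = y + int_vec[1], x + int_vec[0]

    if not half_flag[0] and not half_flag[1]:
        return pel_ref[row][col]
    if not half_flag[0] and half_flag[1]:
        # Vertical half-sample: average with the sample one line below.
        return (pel_ref[row][col] + pel_ref[row + 1][col] + 1) >> 1
    if half_flag[0] and not half_flag[1]:
        # Horizontal half-sample: average with the sample to the right.
        return (pel_ref[row][col] + pel_ref[row][col + 1] + 1) >> 1
    total = (pel_ref[row][col] + pel_ref[row][col + 1]
             + pel_ref[row + 1][col] + pel_ref[row + 1][col + 1])
    return (total + 2) >> 2
```

Because DIV truncates downwards, a component of -1 gives int_vec = -1 with the half flag set, so the sample is read mid-way between positions -1 and 0 relative to the predicted sample.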

7.6.5 Motion vector selection 

Table 7-13 shows the prediction modes used in field pictures and Table 7-14 shows the predictions used in 
frame pictures. In each table the motion vectors that are present in the bitstream are listed in the order in 
which they appear in the bitstream. 

Table 7-13. Predictions and motion vectors in field pictures 

                 macroblock_   macroblock_ 
field_motion_    motion_       motion_       macroblock_ 
type             forward       backward      intra         Motion vector           Prediction formed for 
Field-based‡     -             -             1             vector'[0][0][1:0]◊     None (motion vector is for concealment) 
Field-based      1             1             0             vector'[0][0][1:0]      whole field, forward 
                                                           vector'[0][1][1:0]      whole field, backward 
Field-based      1             0             0             vector'[0][0][1:0]      whole field, forward 
Field-based      0             1             0             vector'[0][1][1:0]      whole field, backward 
Field-based‡     0             0             0             vector'[0][0][1:0]*§    whole field, forward 
16x8 MC          1             1             0             vector'[0][0][1:0]      upper 16x8 field, forward 
                                                           vector'[1][0][1:0]      lower 16x8 field, forward 
                                                           vector'[0][1][1:0]      upper 16x8 field, backward 
                                                           vector'[1][1][1:0]      lower 16x8 field, backward 
16x8 MC          1             0             0             vector'[0][0][1:0]      upper 16x8 field, forward 
                                                           vector'[1][0][1:0]      lower 16x8 field, forward 
16x8 MC          0             1             0             vector'[0][1][1:0]      upper 16x8 field, backward 
                                                           vector'[1][1][1:0]      lower 16x8 field, backward 
Dual prime       1             0             0             vector'[0][0][1:0]      whole field, from same parity, forward 
                                                           vector'[2][0][1:0]*†    whole field, from opposite parity, forward 

NOTE - Motion vectors are listed in the order they appear in the bitstream. 

◊ The motion vector is only present if concealment_motion_vectors is one. 

‡ field_motion_type is not present in the bitstream but is assumed to be Field-based. 

* These motion vectors are not present in the bitstream. 

† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6. 

§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5. 

Table 7-14. Predictions and motion vectors in frame pictures 

                 macroblock_   macroblock_ 
frame_motion_    motion_       motion_       macroblock_ 
type             forward       backward      intra         Motion vector           Prediction formed for 
Frame-based‡     -             -             1             vector'[0][0][1:0]◊     None (motion vector is for concealment) 
Frame-based      1             1             0             vector'[0][0][1:0]      frame, forward 
                                                           vector'[0][1][1:0]      frame, backward 
Frame-based      1             0             0             vector'[0][0][1:0]      frame, forward 
Frame-based      0             1             0             vector'[0][1][1:0]      frame, backward 
Frame-based‡     0             0             0             vector'[0][0][1:0]*§    frame, forward 
Field-based      1             1             0             vector'[0][0][1:0]      top field, forward 
                                                           vector'[1][0][1:0]      bottom field, forward 
                                                           vector'[0][1][1:0]      top field, backward 
                                                           vector'[1][1][1:0]      bottom field, backward 
Field-based      1             0             0             vector'[0][0][1:0]      top field, forward 
                                                           vector'[1][0][1:0]      bottom field, forward 
Field-based      0             1             0             vector'[0][1][1:0]      top field, backward 
                                                           vector'[1][1][1:0]      bottom field, backward 
Dual prime       1             0             0             vector'[0][0][1:0]      top field, from same parity, forward 
                                                           vector'[0][0][1:0]      bottom field, from same parity, forward 
                                                           vector'[2][0][1:0]*†    top field, from opposite parity, forward 
                                                           vector'[3][0][1:0]*†    bottom field, from opposite parity, forward 

NOTE - Motion vectors are listed in the order they appear in the bitstream. 

◊ The motion vector is only present if concealment_motion_vectors is one. 

‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based. 

* These motion vectors are not present in the bitstream. 

† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6. 

§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5. 



7.6.6 Skipped macroblocks 

A skipped macroblock is a macroblock for which no data is encoded, that is part of a coded slice. Except 
at the start of a slice, if the number (macroblock_address - previous_macroblock_address - 1) is larger 
than zero then this number indicates the number of macroblocks that have been skipped. The decoder 
shall form a prediction for skipped macroblocks which shall then be used as the final decoded sample 
values. 

The handling of skipped macroblocks is different between P-pictures and B-pictures. In addition the 
process differs between field pictures and frame pictures. 

There shall be no skipped macroblocks in I-pictures except when: 

• either picture_spatial_scalable_extension() follows the picture_header() of the current picture, 

• or sequence_scalable_extension() is present in the bitstream and scalable_mode = "SNR 
scalability". 

7.6.6.1 P field picture 

• The prediction shall be made as if field_motion_type is "Tield-based" 

• The prediction shall be made firom the field of the same parity as the field being predicted. 

• Motion vector predictors shall be reset to zero, 

• The motion vector shall be zero. 

7.6.6.2 P frame picture 

• The prediction shall be made as if firamejnotion_type is "Frame-based" 

• Motion vector predictors shall be reset to zero. 

• The motion vector shall be zero. 

7.6.6 J B field picture 

• The prediction shall be made as if fieldjtnotion.type is "Field-based'* 

• The prediction shall be made fi'om the field of the same parity as the field being predicted. 

• The direction of the prediction forward/backward/bi-directional shall be the same as the previous 
macroblock. 

• Motion vector predictors are unaffected. 

• The motion vectors are taken fixxm the appropriate motion vector predictors. Scaling of the 
motion vectors for colour components shall be performed as described in 7.6.3.7. 

7.6.6.4 B frame picture 

• The prediction shall be made as if fiame_potion_type is '"Frame-based" 

• The direction of the prediction forward/backward/bi-directional shall be the same as the previous 
macroblock. 

• Motion vector predictors are unaffected 

• The motion vectors are taken directly fi-om the appropriate moticm vector predictors. Scaling of 
the motion vectors for colour components shall be performed as described in 7.6.3.7. 

7.6.7 Combining predictions

The final stage is to combine the various predictions together in order to form the final prediction blocks.

It is also necessary to organise the data into blocks that are either field organised or frame organised in
order to be added directly to the decoded coefficients.

The transform data is either field organised or frame organised as specified by dct_type.

7.6.7.1 Simple frame predictions

In the case of simple frame predictions the only further processing that may be required is to average
forward and backward predictions in B-pictures. If pel_pred_forward[y][x] is the forward prediction
sample and pel_pred_backward[y][x] is the corresponding backward prediction then the final prediction
sample shall be formed as;

pel_pred[y][x] = (pel_pred_forward[y][x] + pel_pred_backward[y][x])//2;

The predictions for chrominance components of 4:2:0, 4:2:2 and 4:4:4 formats shall be of size 8 samples
by 8 lines, 8 samples by 16 lines and 16 samples by 16 lines respectively.

7.6.7.2 Simple field predictions

In the case of simple field predictions (i.e. neither 16x8 nor dual prime) the only further processing that
may be required is to average forward and backward predictions in B-pictures. This shall be performed as
specified for "Frame predictions" in the previous clause.

In the case of simple field prediction in a frame picture the predictions for chrominance components of
4:2:0, 4:2:2 and 4:4:4 formats for each field shall be of size 8 samples by 4 lines, 8 samples by 8 lines and
16 samples by 8 lines respectively.

In the case of simple field prediction in a field picture the predictions for chrominance components of
4:2:0, 4:2:2 and 4:4:4 formats for each field shall be of size 8 samples by 8 lines, 8 samples by 16 lines
and 16 samples by 16 lines respectively.

7.6.7.3 16x8 Motion compensation

In this prediction mode separate predictions are formed for the upper 16x8 region of the macroblock and
the lower 16x8 region of the macroblock.

The predictions for chrominance components, for each 16x8 region, of 4:2:0, 4:2:2 and 4:4:4 formats
shall be of size 8 samples by 4 lines, 8 samples by 8 lines and 16 samples by 8 lines respectively.

7.6.7.4 Dual prime

In dual prime mode two predictions are formed for each field in an analogous manner to the backward
and forward predictions in B-pictures. If pel_pred_same_parity[y][x] is the prediction sample from the
same parity field and pel_pred_opposite_parity[y][x] is the corresponding sample from the opposite parity
field then the final prediction sample shall be formed as;

pel_pred[y][x] = (pel_pred_same_parity[y][x] + pel_pred_opposite_parity[y][x])//2;

In the case of dual prime prediction in a frame picture, the predictions for chrominance components of
each field of 4:2:0, 4:2:2 and 4:4:4 formats shall be of size 8 samples by 4 lines, 8 samples by 8 lines and
16 samples by 8 lines respectively.

In the case of dual prime prediction in a field picture, the predictions for chrominance components of
4:2:0, 4:2:2 and 4:4:4 formats shall be of size 8 samples by 8 lines, 8 samples by 16 lines and 16 samples
by 16 lines respectively.
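
The averaging used for B-picture and dual prime predictions can be sketched in C as follows. This is an illustration only: the function name average_predictions and the flat 8-bit buffers are assumptions made for the example, and "//" is taken to mean integer division rounded to the nearest integer with halves rounded away from zero, as defined elsewhere in this Specification. For non-negative samples that reduces to adding 1 before halving.

```c
#include <assert.h>
#include <stddef.h>

/* Average two prediction blocks sample by sample.
 * For non-negative samples, (a + b)//2 equals (a + b + 1) / 2
 * in ordinary truncating integer arithmetic. */
static void average_predictions(const unsigned char *pred_a,
                                const unsigned char *pred_b,
                                unsigned char *pel_pred,
                                size_t count)
{
    assert(pred_a && pred_b && pel_pred);
    for (size_t i = 0; i < count; i++) {
        pel_pred[i] = (unsigned char)((pred_a[i] + pred_b[i] + 1) / 2);
    }
}
```

The same helper would serve both the forward/backward case and the same-parity/opposite-parity case, since the two formulas differ only in the names of their inputs.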

7.6.8 Adding prediction and coefficient data 

The prediction blocks have been formed and reorganised into blocks of prediction samples which
match the field/frame structure used by the transform data blocks.

The transform data f[y][x] shall be added to the prediction data p[y][x] and saturated to form the final
decoded samples d[y][x] as follows;

for (y=0; y<8; y++) {
    for (x=0; x<8; x++) {
        d[y][x] = f[y][x] + p[y][x];
        if (d[y][x] < 0) d[y][x] = 0;
        if (d[y][x] > 255) d[y][x] = 255;
    }
}

7.7 Spatial scalability

This clause specifies the additional decoding process required for the spatial scalable extensions.

Both the lower layer and the enhancement layer shall use the "restricted slice structure" (no gaps between
slices).

Figure 7-13 is a diagram of the video decoding process with spatial scalability. The diagram is simplified
for clarity.

Figure 7-13. Simplified motion compensation process for spatial scalability



Recommendation ITU-T H.262 (1995 E) | ISO/IEC 13818-2: 1995 (E)



7.7.1 Higher syntactic structures 

In general the base layer of a spatial scalable hierarchy can conform to any coding standard including
Recommendation ITU-T H.261, ISO/IEC 11172-2 and this specification. Note however, that within this
specification the decodability of a spatial scalable hierarchy is only considered in the case that the base
layer conforms to this specification or ISO/IEC 11172-2.

Due to the "loose coupling" of layers only one syntactic restriction is needed in the enhancement layer if
both lower and enhancement layer are interlaced. In that case picture_structure has to take the same value
as in the reference frame used for prediction from the lower layer. See 7.7.3.1 for how to identify this
reference frame.

7.7.2 Prediction in the enhancement layer

A motion compensated temporal prediction is made from reference frames in the enhancement layer as
described in 7.6. In addition, a spatial prediction is formed from the lower layer decoded frame
(d_lower[y][x]), as described in 7.7.3. These predictions are selected individually or combined to form the
actual prediction.

In general up to four separate predictions are formed for each macroblock which are combined together to
form the final prediction macroblock p[y][x].

In the case that a macroblock is not coded, either because the entire macroblock is skipped or the specific
macroblock is not coded, there is no coefficient data. In this case f[y][x] is zero and the decoded samples
are simply the prediction, p[y][x].

7.7.3 Formation of spatial prediction

Forming the spatial prediction requires identification of the correct reference frame and definition of the
spatial resampling process, which is done in the following clauses.

The resampling process is defined for a whole frame, however, for decoding of a macroblock, only the
16x16 region in the upsampled frame, which corresponds to the position of this macroblock, is needed.

7.7.3.1 Selection of reference frame

The spatial prediction is made from the reconstructed frame of the lower layer referenced by the
lower_layer_temporal_reference. However, if lower and enhancement layer bitstreams are embedded in a
Recommendation ITU-T H.222.0 | ISO/IEC 13818-1 (Systems) multiplex, this information is overridden
by the timing information given by the decoding time stamps (DTS) in the PES headers.

NOTE - If group_of_pictures_header() occurs often in the lower layer bitstream then the temporal
reference in the lower layer may be ambiguous (because temporal_reference is reset after a
group_of_pictures_header()).

The reconstructed picture from which the spatial prediction is made shall be one of the following:

• The coincident or most recently decoded lower layer picture

• The coincident or most recently decoded lower layer I-picture or P-picture

• The second most recently decoded lower layer I-picture or P-picture provided that the lower layer
does not have low_delay set to '1'.

Note furthermore that spatial scalability will only work efficiently when predictions are formed from
frames in the lower layer which are also coincident (or very close) in display time with the predicted
frame in the enhancement layer.

7.7.3.2 Resampling process

The spatial prediction is made by resampling the lower layer reconstructed frame to the same sample grid 
as the enhancement layer. This grid is defined in terms of frame coordinates, even if a lower-layer 
interlaced frame was actually coded with a pair of field pictures. 

This resampling process is illustrated in Figure 7-14. 



[Figure 7-14 labels: the lower layer picture, of size lower_layer_prediction_horizontal_size by
lower_layer_prediction_vertical_size, is interpolated to a region of size
lower_layer_prediction_horizontal_size * horizontal_subsampling_factor_n / horizontal_subsampling_factor_m by
lower_layer_prediction_vertical_size * vertical_subsampling_factor_n / vertical_subsampling_factor_m,
located by lower_layer_horizontal_offset and lower_layer_vertical_offset.]

Figure 7-14. Formation of the "spatial" prediction by interpolation of the lower layer picture

Spatial predictions shall only be made for macroblocks in the enhancement layer that lie wholly within the 
upsampled lower layer reconstructed frame. 

The upsampling process depends on whether the lower layer reconstructed frame is interlaced or
progressive, as indicated by lower_layer_progressive_frame and whether the enhancement layer frame is
interlaced or progressive, as indicated by progressive_frame.

When lower_layer_progressive_frame is '1', the lower layer reconstructed frame (renamed to prog_pic) is
resampled vertically as described in 7.7.3.5. The resulting frame is considered to be progressive if
progressive_frame is '1' and interlaced if progressive_frame is '0'. The resulting frame is resampled
horizontally as described in 7.7.3.6. lower_layer_deinterlaced_field_select shall have the value '1'.

When lower_layer_progressive_frame is '0' and progressive_frame is '0', each lower layer reconstructed
field is deinterlaced as described in 7.7.3.4, to produce a progressive field (prog_pic). This field is
resampled vertically as described in 7.7.3.5. The resulting field is resampled horizontally as described in
7.7.3.6. Finally the resulting field is subsampled to produce an interlaced field.
lower_layer_deinterlaced_field_select shall have the value '1'.

When lower_layer_progressive_frame is '0' and progressive_frame is '1', each lower layer reconstructed
field is deinterlaced as described in 7.7.3.4, to produce a progressive field (prog_pic). Only one of these
fields is required. When lower_layer_deinterlaced_field_select is '0' the top field is used, otherwise the
bottom field is used. The one that is used is resampled vertically as described in 7.7.3.5. The resulting
frame is resampled horizontally as described in 7.7.3.6.

For interlaced frames, if the current (and implicitly the lower-layer) frame are encoded as field pictures,
the deinterlacing process described in 7.7.3.4 is done within the field.

lower_layer_vertical_offset and lower_layer_horizontal_offset, defining the position of the lower layer
frame within the current frame, shall be taken into account in the resampling definitions in 7.7.3.5 and
7.7.3.6 respectively. The lower layer offsets are limited to even values when the chrominance in the
enhancement layer is subsampled in that dimension in order to align the chrominance samples between
the two layers.

The upsampling process is summarised in Table 7-15.



Table 7-15 Upsampling process

lower_layer_deinterlaced_  lower_layer_        progressive_  Apply deinterlace  Entity used
field_select               progressive_frame   frame         process            for prediction
-------------------------  ------------------  ------------  -----------------  --------------
0                          0                   1             yes                top field
1                          0                   1             yes                bottom field
1                          1                   1             no                 frame
1                          1                   0             no                 frame
1                          0                   0             yes                both fields
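
The decision logic of Table 7-15 can be sketched as a small C function. The type and function names below (UpsamplingMode, select_upsampling_mode and the enum values) are invented for this illustration and do not appear in the Specification; the assertions simply reflect the "shall have the value '1'" constraints stated above.

```c
#include <assert.h>

/* Which part of the lower layer picture feeds the spatial prediction. */
typedef enum {
    USE_TOP_FIELD,
    USE_BOTTOM_FIELD,
    USE_FRAME,
    USE_BOTH_FIELDS
} PredictionEntity;

typedef struct {
    int apply_deinterlace;      /* 1 = run the deinterlacing process */
    PredictionEntity entity;
} UpsamplingMode;

static UpsamplingMode select_upsampling_mode(int lower_layer_progressive_frame,
                                             int progressive_frame,
                                             int lower_layer_deinterlaced_field_select)
{
    UpsamplingMode mode;
    if (lower_layer_progressive_frame) {
        /* Progressive lower layer: no deinterlacing, whole frame is used. */
        assert(lower_layer_deinterlaced_field_select == 1);
        mode.apply_deinterlace = 0;
        mode.entity = USE_FRAME;
    } else if (progressive_frame) {
        /* Interlaced lower layer, progressive enhancement layer:
         * one deinterlaced field is chosen by the select flag. */
        mode.apply_deinterlace = 1;
        mode.entity = lower_layer_deinterlaced_field_select ? USE_BOTTOM_FIELD
                                                            : USE_TOP_FIELD;
    } else {
        /* Both layers interlaced: both fields are deinterlaced and used. */
        assert(lower_layer_deinterlaced_field_select == 1);
        mode.apply_deinterlace = 1;
        mode.entity = USE_BOTH_FIELDS;
    }
    return mode;
}
```

A real decoder would more likely report a bitstream error than assert when the select flag violates the constraint; the assertions are only a compact way to show where the constraint applies.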



7.7.3.3 Colour component processing

Due to the different sampling grids of luminance and chrominance components, some variables used in
7.7.3.4 to 7.7.3.6 take different values for luminance and chrominance resampling. Furthermore it is
permissible for the chrominance formats in the lower layer and the enhancement layer to be different from
one another.

Table 7-16 defines the values for the variables used in 7.7.3.4 to 7.7.3.6.

Table 7-16 Local variables used in 7.7.3.3 to 7.7.3.5

variable      value for luminance processing            value for chrominance processing
------------  ----------------------------------------  ------------------------------------------------------------------------
ll_h_size     lower_layer_prediction_horizontal_size    lower_layer_prediction_horizontal_size / chroma_ratio_horizontal[lower]
ll_v_size     lower_layer_prediction_vertical_size      lower_layer_prediction_vertical_size / chroma_ratio_vertical[lower]
ll_h_offset   lower_layer_horizontal_offset             lower_layer_horizontal_offset / chroma_ratio_horizontal[enhance]
ll_v_offset   lower_layer_vertical_offset               lower_layer_vertical_offset / chroma_ratio_vertical[enhance]
h_subs_m      horizontal_subsampling_factor_m           horizontal_subsampling_factor_m
h_subs_n      horizontal_subsampling_factor_n           horizontal_subsampling_factor_n * format_ratio_horizontal
v_subs_m      vertical_subsampling_factor_m             vertical_subsampling_factor_m
v_subs_n      vertical_subsampling_factor_n             vertical_subsampling_factor_n * format_ratio_vertical



Tables 7-17 and 7-18 give additional definitions. 

Table 7-17 chrominance subsampling ratios for layer = {lower, enhance}

chrominance format  chroma_ratio_       chroma_ratio_
lower layer         horizontal[layer]   vertical[layer]
------------------  ------------------  ---------------
4:2:0               2                   2
4:2:2               2                   1
4:4:4               1                   1



Table 7-18 chrominance format ratios

chrominance format  chrominance format  format_ratio_  format_ratio_
lower layer         enhancement layer   horizontal     vertical
------------------  ------------------  -------------  -------------
4:2:0               4:2:0               1              1
4:2:0               4:2:2               1              2
4:2:0               4:4:4               2              2
4:2:2               4:2:2               1              1
4:2:2               4:4:4               2              1
4:4:4               4:4:4               1              1
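
As a rough sketch, the chrominance column of Table 7-16 can be derived from the luminance values with Tables 7-17 and 7-18. The struct ResampleVars, the enum ChromaFormat and the function names are assumptions made for this example. The helper for Table 7-18 relies on the observation that every listed entry equals the lower layer ratio divided by the enhancement layer ratio; that is a property of the listed values, not a rule stated in the text.

```c
#include <assert.h>

typedef enum { CHROMA_420, CHROMA_422, CHROMA_444 } ChromaFormat;

/* Table 7-17: chrominance subsampling ratios. */
static int chroma_ratio_horizontal(ChromaFormat f) { return f == CHROMA_444 ? 1 : 2; }
static int chroma_ratio_vertical(ChromaFormat f)   { return f == CHROMA_420 ? 2 : 1; }

/* Table 7-18: only combinations with lower <= enhance are listed. */
static int format_ratio_horizontal(ChromaFormat lower, ChromaFormat enhance)
{
    assert(lower <= enhance);
    return chroma_ratio_horizontal(lower) / chroma_ratio_horizontal(enhance);
}

static int format_ratio_vertical(ChromaFormat lower, ChromaFormat enhance)
{
    assert(lower <= enhance);
    return chroma_ratio_vertical(lower) / chroma_ratio_vertical(enhance);
}

/* Local variables of Table 7-16. */
typedef struct {
    int ll_h_size, ll_v_size;
    int ll_h_offset, ll_v_offset;
    int h_subs_m, h_subs_n;
    int v_subs_m, v_subs_n;
} ResampleVars;

/* Derive the chrominance values from the luminance values. */
static ResampleVars chroma_vars(ResampleVars luma, ChromaFormat lower, ChromaFormat enhance)
{
    ResampleVars c;
    c.ll_h_size   = luma.ll_h_size   / chroma_ratio_horizontal(lower);
    c.ll_v_size   = luma.ll_v_size   / chroma_ratio_vertical(lower);
    c.ll_h_offset = luma.ll_h_offset / chroma_ratio_horizontal(enhance);
    c.ll_v_offset = luma.ll_v_offset / chroma_ratio_vertical(enhance);
    c.h_subs_m    = luma.h_subs_m;
    c.h_subs_n    = luma.h_subs_n * format_ratio_horizontal(lower, enhance);
    c.v_subs_m    = luma.v_subs_m;
    c.v_subs_n    = luma.v_subs_n * format_ratio_vertical(lower, enhance);
    return c;
}
```

For example, a 352x288 lower layer in 4:2:0 upsampled by 2:1 into a 4:2:2 enhancement layer gives 176x144 chrominance input with v_subs_n doubled, so the chrominance is stretched to the full enhancement layer height.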

7.7.3.4 Deinterlacing

If deinterlacing need not be done (according to Table 7-15), the lower layer reconstructed frame
(d_lower[y][x]) is renamed to input_pic.

First, each lower layer field is padded with zeros to form a progressive grid at a frame rate equal to the
field rate of the lower layer, and with the same number of lines and samples per line as the lower layer
frame. Table 7-19 specifies the filters to be applied next. The luminance component is filtered using the
relevant two field aperture filter if picture_structure == "Frame-Picture" or else using the one field
aperture filter. The chrominance component is filtered using the one field aperture filter.

The temporal and vertical columns of the table indicate the relative spatial and temporal coordinates of
the samples to which the filter taps defined in the other columns apply. An intermediate sum is
formed by adding the multiplied coefficients together.



Table 7-19. Deinterlacing Filter

                      two field aperture                                 one field aperture
Temporal  Vertical    Filter for first field   Filter for second field   Filter (both fields)
--------  --------    ----------------------   -----------------------   --------------------
-1        -2          0                        -1                        0
-1        0           0                        2                         0
-1        +2          0                        -1                        0
0         -1          8                        8                         8
0         0           16                       16                        16
0         +1          8                        8                         8
+1        -2          -1                       0                         0
+1        0           2                        0                         0
+1        +2          -1                       0                         0



The output of the filter (sum) is then scaled according to the following formula:

prog_pic[y][x] = sum // 16

and saturated to lie in the range [0:255].

The filter aperture can extend outside the coded picture size. In this case the samples of the lines outside
the active picture shall take the value of the closest neighbouring existing sample (below or above) of the
same field as defined below.

For all samples [y][x]:

if (y < 0 && (y & 1) == 1)
    y = 1
if (y < 0 && (y & 1) == 0)
    y = 0
if (y >= ll_v_size && ((y - ll_v_size) & 1) == 1)
    y = ll_v_size - 1
if (y >= ll_v_size && ((y - ll_v_size) & 1) == 0)
    y = ll_v_size - 2
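
The border rule and the one field aperture filter of Table 7-19 can be sketched together in C. This is one reading of the zero-padding description, not a reference implementation: the sample column is assumed to be stored in frame line order, only lines whose parity equals field_parity are treated as valid, and lines of the other field contribute zero. The function names are hypothetical.

```c
#include <assert.h>

/* Border rule: map a line index outside the picture to the closest
 * existing line of the same parity (i.e. of the same field). */
static int clamp_line_same_field(int y, int ll_v_size)
{
    if (y < 0)
        return (y & 1) ? 1 : 0;
    if (y >= ll_v_size)
        return ((y - ll_v_size) & 1) ? ll_v_size - 1 : ll_v_size - 2;
    return y;
}

/* One field aperture: taps 8, 16, 8 at vertical offsets -1, 0, +1.
 * column holds ll_v_size samples of one picture column in frame order. */
static int deinterlace_one_field(const unsigned char *column, int ll_v_size,
                                 int field_parity, int y)
{
    static const int tap[3] = { 8, 16, 8 };
    int sum = 0;
    assert(column && ll_v_size >= 2);
    for (int k = -1; k <= 1; k++) {
        int line = clamp_line_same_field(y + k, ll_v_size);
        /* Zero padding: lines of the other field contribute nothing. */
        if ((line & 1) == field_parity)
            sum += tap[k + 1] * column[line];
    }
    int out = (sum + 8) / 16;   /* sum // 16 for a non-negative sum */
    return out > 255 ? 255 : out;
}
```

With this reading, a line that exists in the field is passed through unchanged and a missing line becomes the rounded average of its two vertical neighbours, which matches the unity gain of the tap set in either case.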



7.7.3.5 Vertical resampling

The frame subject to vertical resampling, prog_pic, is resampled to the enhancement layer vertical
sampling grid using linear interpolation between the sample sites according to the following formula,
where vert_pic is the resulting field:

vert_pic[yh + ll_v_offset][x] = (16 - phase) * prog_pic[y1][x] + phase * prog_pic[y2][x]

where   yh + ll_v_offset = output sample coordinate in vert_pic

        y1 = (yh * v_subs_m) / v_subs_n

        y2 = y1 + 1     if y1 < ll_v_size - 1
             y1         otherwise

        phase = (16 * ((yh * v_subs_m) % v_subs_n)) // v_subs_n

Samples which lie outside the lower layer reconstructed frame which are required for upsampling are
obtained by border extension of the lower layer reconstructed frame.

NOTE - The calculation of phase assumes that the sample position in the enhancement layer at
yh = 0 is spatially coincident with the first sample position of the lower layer. It is recognised
that this is an approximation for the chrominance component if the chroma_format == 4:2:0.

7.7.3.6 Horizontal resampling

The frame subject to horizontal resampling, vert_pic, is resampled to the enhancement layer horizontal
sampling grid using linear interpolation between the sample sites according to the following formula,
where hor_pic is the resulting field:

hor_pic[y][xh + ll_h_offset] = ((16 - phase) * vert_pic[y][x1] + phase * vert_pic[y][x2]) // 256

where   xh + ll_h_offset = output sample coordinate in hor_pic

        x1 = (xh * h_subs_m) / h_subs_n

        x2 = x1 + 1     if x1 < ll_h_size - 1
             x1         otherwise

        phase = (16 * ((xh * h_subs_m) % h_subs_n)) // h_subs_n

Samples which lie outside the lower layer reconstructed frame which are required for upsampling are
obtained by border extension of the lower layer reconstructed frame.
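
The two resampling passes can be sketched in C as below. The sketch assumes both offsets are zero, uses flat int buffers, and takes "//" to be integer division rounded to the nearest integer with halves rounded away from zero; the names div_round, resample_vertical and resample_horizontal are illustrative only. Note that the vertical pass leaves its output scaled by 16, and the horizontal pass removes both scale factors with the division by 256.

```c
#include <assert.h>

/* "//": integer division rounding to nearest, halves away from zero. */
static int div_round(int a, int b)
{
    assert(b > 0);
    return (a >= 0) ? (a + b / 2) / b : -((-a + b / 2) / b);
}

/* Vertical pass: vert_pic receives out_v_size lines of width samples,
 * each value scaled by 16. */
static void resample_vertical(const int *prog_pic, int width, int ll_v_size,
                              int v_subs_m, int v_subs_n,
                              int *vert_pic, int out_v_size)
{
    for (int yh = 0; yh < out_v_size; yh++) {
        int y1 = (yh * v_subs_m) / v_subs_n;
        int y2 = (y1 < ll_v_size - 1) ? y1 + 1 : y1;
        int phase = div_round(16 * ((yh * v_subs_m) % v_subs_n), v_subs_n);
        for (int x = 0; x < width; x++)
            vert_pic[yh * width + x] = (16 - phase) * prog_pic[y1 * width + x]
                                     + phase * prog_pic[y2 * width + x];
    }
}

/* Horizontal pass: hor_pic receives height lines of out_h_size samples. */
static void resample_horizontal(const int *vert_pic, int height, int ll_h_size,
                                int h_subs_m, int h_subs_n,
                                int *hor_pic, int out_h_size)
{
    for (int y = 0; y < height; y++) {
        for (int xh = 0; xh < out_h_size; xh++) {
            int x1 = (xh * h_subs_m) / h_subs_n;
            int x2 = (x1 < ll_h_size - 1) ? x1 + 1 : x1;
            int phase = div_round(16 * ((xh * h_subs_m) % h_subs_n), h_subs_n);
            int acc = (16 - phase) * vert_pic[y * ll_h_size + x1]
                    + phase * vert_pic[y * ll_h_size + x2];
            hor_pic[y * out_h_size + xh] = div_round(acc, 256);
        }
    }
}
```

Upsampling the single row {0, 160} horizontally by 2:1 (h_subs_m = 1, h_subs_n = 2) yields {0, 80, 160, 160}: the last output sample repeats the border value because x2 is held at x1 on the final input column.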

7.7.3.7 Reinterlacing

If reinterlacing need not be done, the result of the resampling process, hor_pic, is renamed to
spat_pred_pic.

If hor_pic was derived from the top field of a lower layer interlaced frame, the even lines of hor_pic are
copied to the even lines of spat_pred_pic.

If hor_pic was derived from the bottom field of a lower layer interlaced frame the odd lines of hor_pic are
copied to the odd lines of spat_pred_pic.

If hor_pic was derived from a lower layer progressive frame, hor_pic is copied to spat_pred_pic.
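
A minimal C sketch of the reinterlacing copy follows. The enum HorPicSource and the function reinterlace are names made up for this example, and the pictures are assumed to be stored as contiguous 8-bit lines of equal width.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    FROM_TOP_FIELD,
    FROM_BOTTOM_FIELD,
    FROM_PROGRESSIVE_FRAME
} HorPicSource;

/* Copy the lines of hor_pic that belong in spat_pred_pic:
 * even lines for a top field, odd lines for a bottom field,
 * every line for a progressive frame. */
static void reinterlace(const unsigned char *hor_pic, unsigned char *spat_pred_pic,
                        int width, int height, HorPicSource source)
{
    assert(hor_pic && spat_pred_pic && width > 0 && height > 0);
    int first = (source == FROM_BOTTOM_FIELD) ? 1 : 0;
    int step  = (source == FROM_PROGRESSIVE_FRAME) ? 1 : 2;
    for (int y = first; y < height; y += step)
        memcpy(spat_pred_pic + (size_t)y * width,
               hor_pic + (size_t)y * width,
               (size_t)width);
}
```

Calling it once with the picture derived from the top field and once with the picture derived from the bottom field fills every line of spat_pred_pic.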

7.7.4 Selection and combination of spatial and temporal predictions

The spatial and temporal predictions can be selected or combined to form the actual prediction. The
macroblock_type (Tables B-5, B-6 and B-7) and the additional spatial_temporal_weight_code (Table 7-21)
indicate, by use of the spatial_temporal_weight_class, whether the prediction is temporal-only,
spatial-only or a weighted combination of temporal and spatial predictions. Classes are defined in the
following way:

Class 0 indicates temporal-only prediction

Class 1 indicates that neither field has spatial-only prediction

Class 2 indicates that the top field is spatial-only prediction

Class 3 indicates that the bottom field is spatial-only prediction

Class 4 indicates spatial-only prediction

In intra pictures, if spatial_temporal_weight_class is 0, normal intra coding is performed, otherwise the
prediction is spatial-only. In predicted and interpolated pictures, if the spatial_temporal_weight_class is 0,
prediction is temporal-only, if the spatial_temporal_weight_class is 4, prediction is spatial-only, otherwise
one or a pair of prediction weights is used to combine the spatial and temporal predictions.

The possible spatial_temporal_weights are given in a weight table which is selected in the picture spatial
scalable extension. Up to four different weight tables are available for use depending on whether the
current and lower layers are interlaced or progressive, as indicated in Table 7-20 (allowed, yet not
recommended values given in brackets).

Table 7-20. Intended (allowed) spatial_temporal_weight_code_table_index values

Lower layer format                                            Enhancement    spatial_temporal_weight_
                                                              layer format   code_table_index
------------------------------------------------------------  -------------  ------------------------
Progressive or interlaced                                     Progressive    00
Progressive coincident with enhancement layer top fields      Interlaced     10 (00; 01; 11)
Progressive coincident with enhancement layer bottom fields   Interlaced     01 (00; 10; 11)
Interlaced (picture_structure == Frame-Picture)               Interlaced     00 or 11 (01; 10)
Interlaced (picture_structure != Frame-Picture)               Interlaced     00



In macroblock_modes(), a two bit code, spatial_temporal_weight_code, is used to describe the prediction
for each field (or frame), as shown in Table 7-21. In this table spatial_temporal_integer_weight
identifies those spatial_temporal_weight_codes that can also be used with dual prime prediction (see
Tables 7-22, 7-23).



Table 7-21 spatial_temporal_weights and spatial_temporal_weight_classes for the
spatial_temporal_weight_code_table_index and spatial_temporal_weight_codes

spatial_temporal_   spatial_temporal_  spatial_temporal_  spatial_temporal_  spatial_temporal_
weight_code_table_  weight_code        weight(s)          weight class       integer_weight
index
------------------  -----------------  -----------------  -----------------  -----------------
00*                 -                  (0,5)              1                  0
01                  00                 (0; 1)             3                  1
                    01                 (0; 0,5)           1                  0
                    10                 (0,5; 1)           3                  0
                    11                 (0,5; 0,5)         1                  0
10                  00                 (1; 0)             2                  1
                    01                 (0,5; 0)           1                  0
                    10                 (1; 0,5)           2                  0
                    11                 (0,5; 0,5)         1                  0
11                  00                 (1; 0)             2                  1
                    01                 (1; 0,5)           2                  0
                    10                 (0,5; 1)           3                  0
                    11                 (0,5; 0,5)         1                  0

* For spatial_temporal_weight_code_table_index == 00 no spatial_temporal_weight_code is transmitted.





NOTE - Spatial-only prediction (weight_class == 4) is signalled by different values of
macroblock_type (see Tables B-5 to B-7).

When the spatial_temporal_weight combination is given in the form (a; b), "a" gives the proportion of the
prediction for the top field which is derived from the spatial prediction and "b" gives the proportion of
the prediction for the bottom field which is derived from the spatial prediction for that field.

When the spatial_temporal_weight is given in the form (a), "a" gives the proportion of the prediction for
the picture which is derived from the spatial prediction for that picture.

The precise method for predictor calculation is as follows: 

pel_pred_temp[y][x] is used to denote the temporal prediction (formed within the enhancement layer) as 
defined for pel_pred[y][x] in 7.6. pel_pred_spat[y][x] is used to denote the prediction formed from the 
lower layer by extracting the appropriate samples, co-located with the current macroblock position, from 
spat_pred_pic. 

If the spatial_temporal_weight is zero then no prediction is made from the lower layer. Therefore;

pel_pred[y][x] = pel_pred_temp[y][x];

If the spatial_temporal_weight is one then no prediction is made from the enhancement layer. Therefore;

pel_pred[y][x] = pel_pred_spat[y][x];

If the weight is one half then the prediction is the average of the temporal and spatial predictions.
Therefore;

pel_pred[y][x] = (pel_pred_temp[y][x] + pel_pred_spat[y][x])//2;

When progressive_frame = 0 chrominance is treated as interlaced, that is, the first weight is used for the 
top field chrominance lines and the second weight is used for the bottom field chrominance lines. 



Addition of prediction and coefficient data is then done as in 7.6.8.
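
The three weight cases can be sketched in C as follows. To keep the arithmetic in integers the sketch expresses each spatial_temporal_weight in halves (0, 1 or 2 for the weights 0, 0,5 and 1); that encoding, the function names and the flat 8-bit buffers are assumptions of this example. For a frame treated as interlaced, the first weight is applied to the even (top field) lines and the second to the odd (bottom field) lines.

```c
#include <assert.h>

/* Combine one sample. weight_halves: 0 -> weight 0, 1 -> 0,5, 2 -> 1. */
static int combine_sample(int pel_pred_temp, int pel_pred_spat, int weight_halves)
{
    assert(weight_halves >= 0 && weight_halves <= 2);
    if (weight_halves == 0)
        return pel_pred_temp;                        /* temporal only */
    if (weight_halves == 2)
        return pel_pred_spat;                        /* spatial only  */
    return (pel_pred_temp + pel_pred_spat + 1) / 2;  /* "//2", samples >= 0 */
}

/* Combine a block, using the top field weight on even lines and the
 * bottom field weight on odd lines. */
static void combine_block(const unsigned char *temp, const unsigned char *spat,
                          unsigned char *pel_pred, int width, int height,
                          int weight_top_halves, int weight_bottom_halves)
{
    for (int y = 0; y < height; y++) {
        int w = (y & 1) ? weight_bottom_halves : weight_top_halves;
        for (int x = 0; x < width; x++) {
            int i = y * width + x;
            pel_pred[i] = (unsigned char)combine_sample(temp[i], spat[i], w);
        }
    }
}
```

For a picture with a single weight (a), the same value would be passed for both fields.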

7.7.5 Updating motion vector predictors and motion vector selection 

In frame pictures where field prediction is used the possibility exists that one of the fields is predicted
using spatial-only prediction. In this case no motion vector is present in the bitstream for the field which
has spatial-only prediction. For the case where both fields of a frame have spatial-only prediction, the
macroblock_type is such that no motion vectors are present in the bitstream for that macroblock.

The spatial_temporal_weight_class also indicates the number of motion vectors which are
present in the coded bitstream and how the motion vector predictors are updated as
defined in Table 7-22 and Table 7-23.

Table 7-22. Updating of motion vector predictors in Field Pictures

field_motion_  macroblock_     macroblock_      macroblock_  spatial_temporal_  Predictors to update
type           motion_forward  motion_backward  intra        weight_class
-------------  --------------  ---------------  -----------  -----------------  -------------------------------
Field-based‡   -               -                1            0                  PMV[1][0][1:0] = PMV[0][0][1:0]◊
Field-based    1               1                0            0                  PMV[1][0][1:0] = PMV[0][0][1:0]
                                                                                PMV[1][1][1:0] = PMV[0][1][1:0]
Field-based    1               0                0            0,1                PMV[1][0][1:0] = PMV[0][0][1:0]
Field-based    0               1                0            0,1                PMV[1][1][1:0] = PMV[0][1][1:0]
Field-based‡   0               0                0            0,1,4              PMV[r][s][t] = 0 §
16x8 MC        1               1                0            0                  (none)
16x8 MC        1               0                0            0,1                (none)
16x8 MC        0               1                0            0,1                (none)
Dual prime     1               0                0            0                  PMV[1][0][1:0] = PMV[0][0][1:0]

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that;
PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0]

◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t).

‡ field_motion_type is not present in the bitstream but is assumed to be Field-based

§ PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4.

Table 7-23. Updating of motion vector predictors in Frame Pictures

frame_motion_  macroblock_     macroblock_      macroblock_  spatial_temporal_  Predictors to update
type           motion_forward  motion_backward  intra        weight_class
-------------  --------------  ---------------  -----------  -----------------  -------------------------------
Frame-based‡   -               -                1            0                  PMV[1][0][1:0] = PMV[0][0][1:0]◊
Frame-based    1               1                0            0                  PMV[1][0][1:0] = PMV[0][0][1:0]
                                                                                PMV[1][1][1:0] = PMV[0][1][1:0]
Frame-based    1               0                0            0,1,2,3            PMV[1][0][1:0] = PMV[0][0][1:0]
Frame-based    0               1                0            0,1,2,3            PMV[1][1][1:0] = PMV[0][1][1:0]
Frame-based‡   0               0                0            0,1,2,3,4          PMV[r][s][t] = 0 §
Field-based    1               1                0            0                  (none)
Field-based    1               0                0            0,1                (none)
Field-based    1               0                0            2                  PMV[1][0][1:0] = PMV[0][0][1:0]
Field-based    1               0                0            3                  PMV[1][0][1:0] = PMV[0][0][1:0]
Field-based    0               1                0            0,1                (none)
Field-based    0               1                0            2                  PMV[1][1][1:0] = PMV[0][1][1:0]
Field-based    0               1                0            3                  PMV[1][1][1:0] = PMV[0][1][1:0]
Dual prime@    1               0                0            0,2,3              PMV[1][0][1:0] = PMV[0][0][1:0]

NOTE - PMV[r][s][1:0] = PMV[u][v][1:0] means that;
PMV[r][s][1] = PMV[u][v][1] and PMV[r][s][0] = PMV[u][v][0]

◊ If concealment_motion_vectors is zero then PMV[r][s][t] is set to zero (for all r, s and t).

‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based

§ PMV[r][s][t] is set to zero (for all r, s and t). See 7.6.3.4.

@ Dual prime can not be used when spatial_temporal_integer_weight = '0'.
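
The rows of Table 7-23 can be condensed into a short C routine. This is a sketch of one way to read the table, with hypothetical names (FrameMotionType, copy_pmv, update_pmv_frame_picture); it does not check that the combination of flags and weight class passed in is one the table actually lists.

```c
#include <assert.h>
#include <string.h>

typedef enum { MOTION_FRAME_BASED, MOTION_FIELD_BASED, MOTION_DUAL_PRIME } FrameMotionType;

/* PMV[1][s][1:0] = PMV[0][s][1:0] */
static void copy_pmv(int PMV[2][2][2], int s)
{
    PMV[1][s][0] = PMV[0][s][0];
    PMV[1][s][1] = PMV[0][s][1];
}

static void update_pmv_frame_picture(int PMV[2][2][2], FrameMotionType type,
                                     int motion_forward, int motion_backward,
                                     int intra, int weight_class,
                                     int concealment_motion_vectors)
{
    assert(weight_class >= 0 && weight_class <= 4);
    if (intra) {
        /* Intra row: copy when a concealment vector is present, else reset. */
        if (concealment_motion_vectors)
            copy_pmv(PMV, 0);
        else
            memset(PMV, 0, 2 * 2 * 2 * sizeof(int));
        return;
    }
    switch (type) {
    case MOTION_FRAME_BASED:
        if (!motion_forward && !motion_backward) {
            memset(PMV, 0, 2 * 2 * 2 * sizeof(int));   /* PMV[r][s][t] = 0 */
        } else {
            if (motion_forward)  copy_pmv(PMV, 0);
            if (motion_backward) copy_pmv(PMV, 1);
        }
        break;
    case MOTION_FIELD_BASED:
        /* Only classes 2 and 3 (one field spatial-only) update predictors. */
        if (weight_class == 2 || weight_class == 3) {
            if (motion_forward)  copy_pmv(PMV, 0);
            if (motion_backward) copy_pmv(PMV, 1);
        }
        break;
    case MOTION_DUAL_PRIME:
        copy_pmv(PMV, 0);
        break;
    }
}
```

Table 7-22 for field pictures could be handled the same way, with the 16x8 MC rows leaving the predictors untouched.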



7.7.5.1 Resetting motion vector predictors 

In addition to the cases identified in 7.6.3.4 the motion vector predictors shall be reset in the following
cases;

• In a P-picture when a macroblock is purely spatially predicted
(spatial_temporal_weight_class == 4)

• In a B-picture when a macroblock is purely spatially predicted
(spatial_temporal_weight_class == 4)

NOTE - In case of spatial_temporal_weight_class == 2 in a frame picture when field-based prediction
is used, the transmitted vector is applied for the bottom field (see Table 7-25). However this
vector[0][s][1:0] is predicted from PMV[0][s][1:0]. PMV[1][s][1:0] is then updated as
shown in Table 7-23.

Table 7-24. Predictions and motion vectors in field pictures

field_motion_  macroblock_     macroblock_      macroblock_  spatial_temporal_  Motion vector           Prediction formed for
type           motion_forward  motion_backward  intra        weight_class
-------------  --------------  ---------------  -----------  -----------------  ----------------------  ---------------------------------------
Field-based‡   -               -                1            0                  vector'[0][0][1:0]◊     None (motion vector is for concealment)
Field-based    1               1                0            0                  vector'[0][0][1:0]      whole field, forward
                                                                                vector'[0][1][1:0]      whole field, backward
Field-based    1               0                0            0,1                vector'[0][0][1:0]      whole field, forward
Field-based    0               1                0            0,1                vector'[0][1][1:0]      whole field, backward
Field-based‡   0               0                0            0,1,4              vector'[0][0][1:0]*§    whole field, forward
16x8 MC        1               1                0            0                  vector'[0][0][1:0]      upper 16x8 field, forward
                                                                                vector'[1][0][1:0]      lower 16x8 field, forward
                                                                                vector'[0][1][1:0]      upper 16x8 field, backward
                                                                                vector'[1][1][1:0]      lower 16x8 field, backward
16x8 MC        1               0                0            0,1                vector'[0][0][1:0]      upper 16x8 field, forward
                                                                                vector'[1][0][1:0]      lower 16x8 field, forward
16x8 MC        0               1                0            0,1                vector'[0][1][1:0]      upper 16x8 field, backward
                                                                                vector'[1][1][1:0]      lower 16x8 field, backward
Dual prime     1               0                0            0                  vector'[0][0][1:0]      whole field, same parity, forward
                                                                                vector'[2][0][1:0]*†    whole field, opposite parity, forward

NOTE - Motion vectors are listed in the order they appear in the bitstream

◊ The motion vector is only present if concealment_motion_vectors is one

‡ field_motion_type is not present in the bitstream but is assumed to be Field-based

* These motion vectors are not present in the bitstream

† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6

§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5






Table 7-25. Predictions and motion vectors in frame pictures

frame_motion_  macroblock_     macroblock_      macroblock_  spatial_temporal_  Motion vector           Prediction formed for
type           motion_forward  motion_backward  intra        weight_class
-------------  --------------  ---------------  -----------  -----------------  ----------------------  ---------------------------------------
Frame-based‡   -               -                1            0                  vector'[0][0][1:0]◊     None (motion vector is for concealment)
Frame-based    1               1                0            0                  vector'[0][0][1:0]      frame, forward
                                                                                vector'[0][1][1:0]      frame, backward
Frame-based    1               0                0            0,1,2,3            vector'[0][0][1:0]      frame, forward
Frame-based    0               1                0            0,1,2,3            vector'[0][1][1:0]      frame, backward
Frame-based‡   0               0                0            0,1,2,3,4          vector'[0][0][1:0]*§    frame, forward
Field-based    1               1                0            0                  vector'[0][0][1:0]      top field, forward
                                                                                vector'[1][0][1:0]      bottom field, forward
                                                                                vector'[0][1][1:0]      top field, backward
                                                                                vector'[1][1][1:0]      bottom field, backward
Field-based    1               0                0            0,1                vector'[0][0][1:0]      top field, forward
                                                                                vector'[1][0][1:0]      bottom field, forward
Field-based    1               0                0            2                  -                       top field, spatial
                                                                                vector'[0][0][1:0]      bottom field, forward
Field-based    1               0                0            3                  vector'[0][0][1:0]      top field, forward
                                                                                -                       bottom field, spatial
Field-based    0               1                0            0,1                vector'[0][1][1:0]      top field, backward
                                                                                vector'[1][1][1:0]      bottom field, backward
Field-based    0               1                0            2                  -                       top field, spatial
                                                                                vector'[0][1][1:0]      bottom field, backward
Field-based    0               1                0            3                  vector'[0][1][1:0]      top field, backward
                                                                                -                       bottom field, spatial
Dual prime@    1               0                0            0,2,3              vector'[0][0][1:0]      top field, same parity, forward
                                                                                vector'[0][0][1:0]*     bottom field, same parity, forward
                                                                                vector'[2][0][1:0]*†    top field, opposite parity, forward
                                                                                vector'[3][0][1:0]*†    bottom field, opposite parity, forward
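
NOTE - Motion vectors are listed in the order they appear in the bitstream

◊ The motion vector is only present if concealment_motion_vectors is one

‡ frame_motion_type is not present in the bitstream but is assumed to be Frame-based

* These motion vectors are not present in the bitstream

† These motion vectors are derived from vector'[0][0][1:0] as described in 7.6.3.6

§ The motion vector is taken to be (0; 0) as explained in 7.6.3.5

@ Dual prime can not be used when spatial_temporal_integer_weight = '0'.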




7.7.6 Skipped macroblocks

In all cases, a skipped macroblock is the result of a prediction only, and all the DCT coefficients are
considered to be zero.

If sequence_scalable_extension is present and scalable_mode == "spatial scalability", the following rules
apply in addition to those given in 7.6.6.

In I-pictures, skipped macroblocks are allowed. These are defined as spatial-only predicted. 

In P-pictures and B-pictures, the skipped macroblock is temporal-only predicted. 

In B-pictures a skipped macroblock shall not follow a spatial-only predicted macroblock. 

7.7.7 VBV buffer underflow in the lower layer 

In the case of spatial scalability, VBV buffer underflow in the lower layer may cause problems. This is 
because of possible uncertainty in precisely which frames will be repeated by a particular decoder.

7.8 SNR scalability

[Figure 7-15 block labels: Decoding (QFS[n], QF[v][u]) and Quantisation Arithmetic for each of the lower
and enhancement layers, giving F''lower[v][u] and F''enhance[v][u]; their sum F''[v][u]; Saturation
(F'[v][u]); Mismatch Control (F[v][u]); Motion Compensation with Frame-store Memory; f[y][x] and the
decoded samples d[y][x].]

Figure 7-15. Illustration of decoding process for SNR scalability



This clause describes the additional decoding process required for the SNR scalable extensions. 

SNR scalability defines a mechanism to refine the DCT coefficients encoded in another (lower) layer of a
scalable hierarchy. As illustrated in Figure 7-15, data from two bitstreams is combined after the inverse
quantisation processes by adding the DCT coefficients. Until the data is combined, the decoding processes
of the two layers are independent of one another.

7.8.1 defines how to identify these bitstreams in a scalable hierarchy; however, they can be classified as
follows.

The lower layer, derived from the first bitstream, can itself be either non-scalable, or require the spatial or
temporal scalability decoding process (and hence the decoding of additional bitstreams) to be applied.

The enhancement layer, derived from the second bitstream, contains mainly coded DCT coefficients and a
small overhead. The decoding process for this layer and the combination of the two layers are described in
this clause.

NOTE- All information regarding prediction is contained in the lower layer bitstream only. 

Therefore it is not possible to reconstruct an enhancement layer without decoding the lower 
layer bitstream data in parallel. 

Furthermore prediction and reconstruction of the pictures as described in 7.6, 7.7 and 7.9 for the 
combined lower and enhancement layer is identical to the respective steps for decoding of the lower layer 
bitstream only. 

Semantics and decoding process described in this clause include a mechanism for "chroma simulcast".
This may be used (for instance) to enhance 4:2:0 in the lower layer to 4:2:2 after processing the
enhancement layer data. While the luminance data is processed as described before, in this case the
chrominance information retrieved from the lower layer bitstream (with exception of intra-DC values, see
7.8.3.4) shall be discarded and replaced by the new information with higher chrominance resolution
decoded from the enhancement layer.

It is inherent in SNR scalability that the two layers are very tightly coupled to one another. It is a
requirement that corresponding pictures in each layer shall be decoded at the same time as one another.

In the case that the lower layer bitstream conforms to ISO/IEC 11172-2 (and not this specification) then
two different IDCT mismatch control schemes are being used in decoding. Care must be taken in the
encoder to take account of this.

7.8.1 Higher syntactic structures 

The two bitstream layers in this clause are identified by their layer_id, decoded from the
sequence_scalable_extension.

The two bitstreams shall have consecutive layer_ids, with the enhancement layer bitstream having
layer_id = id_enhance and the lower layer bitstream having layer_id = id_enhance - 1.

The syntax and semantics of the enhancement layer are as defined in 6.2 and 6.3, respectively. 

In the case that the lower layer bitstream conforms to ISO/IEC 11172-2 (and not this specification) then
both this lower and the enhancement layer shall use the "restricted slice structure" defined in this
specification.

Semantic restrictions apply to several values in the headers and extensions of the enhancement layer as 
follows: 

Sequence header 

This header shall be identical to the one in the lower layer bitstream except for the values of bit_rate,
vbv_buffer_size, load_intra_quantiser_matrix, intra_quantiser_matrix, load_non_intra_quantiser_matrix
and non_intra_quantiser_matrix. These can be selected independently except for
load_intra_quantiser_matrix which shall be zero.

Sequence extension

This extension shall be identical to the one in the lower layer bitstream except for the values of
profile_and_level_indication, chroma_format, bit_rate_extension and vbv_buffer_size_extension. Those
can be selected independently.

A different value of chroma_format in each layer will cause the chroma_simulcast flag to be set as
specified by Table 7-26.

The chroma_format of the enhancement layer shall be higher or equal to the chroma_format of the lower
layer bitstream.



Table 7-26. chroma_simulcast flag

chroma_format      chroma_format            chroma_simulcast
(lower layer)      (enhancement layer)
4:2:0              4:2:0                    0
4:2:0              4:2:2                    1
4:2:0              4:4:4                    1
4:2:2              4:2:2                    0
4:2:2              4:4:4                    1
4:4:4              4:4:4                    0
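
As an informal illustration of Table 7-26, the flag can be derived by comparing the chrominance resolution of the two layers. The sketch below is only one way to express the table; the function name and the CHROMA_ORDER mapping are assumptions made for this example and are not syntax elements of this specification.

```python
# Chroma formats ranked by chrominance resolution (lowest first).
# The ranking table is an assumption of this sketch, not a syntax element.
CHROMA_ORDER = {"4:2:0": 0, "4:2:2": 1, "4:4:4": 2}


def chroma_simulcast(lower_format, enhancement_format):
    """Return the chroma_simulcast flag for a pair of chroma_format values."""
    lower = CHROMA_ORDER[lower_format]
    enhancement = CHROMA_ORDER[enhancement_format]
    if enhancement < lower:
        # The enhancement layer chroma_format shall be higher or equal.
        raise ValueError("enhancement layer chroma_format is lower than lower layer")
    # The flag is set only when the two layers differ.
    return 1 if enhancement > lower else 0
```

For example, chroma_simulcast("4:2:0", "4:2:2") gives 1, while equal formats in both layers give 0.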



In the case that the lower layer bitstream conforms to ISO/IEC 11172-2 (and not this specification),
sequence_extension() is not present in the lower layer bitstream, and the following values shall be
assumed for the decoding process.

progressive_sequence          = 1
chroma_format                 = "4:2:0"
horizontal_size_extension     = 0
vertical_size_extension       = 0
bit_rate_extension            = 0
vbv_buffer_size_extension     = 0
low_delay                     = 0
frame_rate_extension_n        = 0
frame_rate_extension_d        = 0

The sequence_extension() in the enhancement layer shall have the values shown above.

Sequence display extension 

This extension shall not be present as there is no separate display process for the enhancement layer. 
Sequence scalable extension 

This extension shall be present with scalable_mode = "SNR scalability".

GOP header 

This header shall be identical to the one in the lower layer bitstream. 

NOTE - The GOP header must be present in each layer in order that the temporal_reference in each 
layer are reset on the same frame. 

Picture header 

This header shall be identical to the one in the lower layer bitstream except for the value of vbv_delay.
This can be selected independently.

Picture coding extension

This extension shall be identical to the one in the lower layer bitstream except for the value of
q_scale_type and alternate_scan. These can be selected independently.

chroma_420_type shall be set to '0' if chroma_simulcast is set. Else it shall have the same value as in the
lower layer bitstream.

In the case that the lower layer bitstream conforms to ISO/IEC 11172-2 (and not this specification) then
picture_coding_extension() is not present in the lower layer bitstream and the following values shall be
assumed for the decoding process:

f_code[0][0] = forward_f_code in the lower layer bitstream or 15
f_code[0][1] = forward_f_code in the lower layer bitstream or 15
f_code[1][0] = backward_f_code in the lower layer bitstream or 15
f_code[1][1] = backward_f_code in the lower layer bitstream or 15
intra_dc_precision = 0
picture_structure = "Frame Picture"
top_field_first = 0
frame_pred_frame_dct = 1
concealment_motion_vectors = 0
intra_vlc_format = 0
repeat_first_field = 0
chroma_420_type = 1
progressive_frame = 1
composite_display_flag = 0

The picture_coding_extension() in the enhancement layer shall have the values shown above.

For the lower layer q_scale_type and alternate_scan shall be assumed to have the value zero.

NOTE - q_scale_type and alternate_scan can be set independently in the enhancement layer.
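
The assumed values listed above for an ISO/IEC 11172-2 lower layer can be gathered in one place, as in the following sketch. The function name and the dictionary layout are hypothetical. The phrase "or 15" is read here as "15 when the lower layer picture carries no such code", which is an interpretation made for this example rather than a statement of the normative text.

```python
UNUSED_F_CODE = 15  # assumed when the lower layer picture has no such f_code


def assumed_picture_coding_extension(forward_f_code=None, backward_f_code=None):
    """Values assumed for a lower layer that has no picture_coding_extension()."""
    fwd = UNUSED_F_CODE if forward_f_code is None else forward_f_code
    bwd = UNUSED_F_CODE if backward_f_code is None else backward_f_code
    return {
        # f_code[s][t]: s = 0 forward, s = 1 backward; both components equal.
        "f_code": [[fwd, fwd], [bwd, bwd]],
        "intra_dc_precision": 0,
        "picture_structure": "Frame Picture",
        "top_field_first": 0,
        "frame_pred_frame_dct": 1,
        "concealment_motion_vectors": 0,
        "intra_vlc_format": 0,
        "repeat_first_field": 0,
        "chroma_420_type": 1,
        "progressive_frame": 1,
        "composite_display_flag": 0,
        # Assumed to be zero for the lower layer only; the enhancement
        # layer may set these two independently.
        "q_scale_type": 0,
        "alternate_scan": 0,
    }
```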

Quant matrix extension 

This extension is optional. Semantics are described in 6.3.11.

load_intra_quantiser_matrix and load_chroma_intra_quantiser_matrix shall both be zero.

NOTE - Only the non-intra matrices will be used in the subsequent decoding process.

Picture display extension 

This extension shall not be present. 

NOTE - There is no separate display process for the enhancement layer. If pan-scan functionality is
desired it can be accomplished already by using the information conveyed by the pan-scan
extension of the lower layer bitstream.



Slice header 

Slices shall be coincident with those in the lower layer. The value of quantiser_scale_code can be set
independently from the lower layer bitstream.

7.8.2 Macroblock 

Subsequently the "current macroblock" denotes the currently processed macroblock. The "current
macroblock of the lower layer" denotes the macroblock identified by having the same macroblock_address
as the current macroblock.

The decoding of the macroblock header information is done according to semantics in 6.3.17.

NOTE - Table B-8, which is used if scalable_mode = "SNR scalability", will never set the
macroblock_intra, macroblock_motion_forward or macroblock_motion_backward flags,
since a macroblock in the enhancement layer contains only refinement data for the current
macroblock of the lower layer.

However the corresponding syntax elements and flags of the current macroblock in the lower
layer bitstream are relevant for the combined decoding process of lower and enhancement
layer following the inverse DCT as described in 7.8.3.5.

7.8.2.1 dct_type 

The syntax element dct_type may be present in none, one or both of the lower and enhancement layer
macroblock_modes(), as indicated by the semantics in 6.3.17.

If dct_type is present in the macroblock_modes() in both layers it shall have identical values.

7.8.2.2 Skipped Macroblocks 

Macroblocks can be skipped in the enhancement layer bitstream, meaning that no coefficient
enhancement is done (F''enhance[v][u] = 0, for all v, u). Regarding this, the decoding process detailed in
7.8.3 shall be applied.

When macroblocks are skipped in both the lower and the enhancement layer bitstreams, the decoding
process is exactly as specified in 7.6.6.

Macroblocks can also be skipped in the lower layer bitstream, while still being coded in the enhancement
layer bitstream. In that case the decoding process detailed in the following has to be applied, but
F''lower[v][u] = 0, for all v, u.

7.8.3 Block

The first part of the decoding process of the enhancement layer block is independent from the lower layer.

The second part of the decoding process of the enhancement layer block has to be done jointly with the
decoding process of the coincident lower layer block.

Two sets of inverse quantised coefficients F''lower and F''enhance are added to form F'' (see Figure 7-15).
F''lower is derived from the lower layer bitstream exactly as defined in 7.1 to 7.4.2.3.
F''enhance is derived as is defined in the clauses below.

The resulting F'' is further processed, starting with saturation, as defined in 7.4.3 to 7.6 (7.7, 7.9).

7.8.3.1 Variable length decoding

In an enhancement layer block the VLC decoding shall be performed according to 7.2, as for a non-intra
block (as indicated by macroblock_intra = 0).

7.8.3.2 Inverse scan

Inverse scan shall be done exactly as defined in 7.3.

7.8.3.3 Inverse quantisation

In an enhancement layer block the inverse quantisation shall be performed according to 7.4.2 as for a
non-intra block.

In the case that the lower layer bitstream conforms to ISO/IEC 11172-2 (and not this specification) then
the "inverse quantisation arithmetic" used to derive F''lower[v][u] (see Figure 7-14) shall include the
IDCT mismatch control (oddification) and saturation specified in ISO/IEC 11172-2.

7.8.3.4 Addition of coefficients from the two layers

Corresponding coefficients from the blocks of each layer shall be added together to form F'' (see Figure
7-15).

F''[v][u] = F''lower[v][u] + F''enhance[v][u], for all u, v

If chroma_simulcast = 1 is set, only the luminance blocks are treated as described above.

For chrominance blocks the DC coefficient of the base layer is used as a prediction of the DC coefficient
in the coincident block in the enhancement layer, whereas the AC coefficients of the base layer are
discarded and AC coefficients of the enhancement layer form F'' in Figure 7-14 according to the following
formulae:

F''[0][0] = F''lower[0][0] + F''enhance[0][0]

F''[v][u] = F''enhance[v][u], for all u, v except u = v = 0

NOTE - Chroma simulcast blocks are inverse quantised like non-intra blocks and use the
chrominance non-intra matrix.
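
The addition rules of this clause can be sketched as follows. This is an informal illustration, assuming 8x8 blocks held as nested lists indexed [v][u]; the function name and the chroma_block parameter are hypothetical. For chroma simulcast, the caller is assumed to have already selected the lower layer chrominance block whose DC coefficient serves as the prediction. A skipped macroblock in either layer corresponds to passing an all-zero block for that layer.

```python
def combine_coefficients(f_lower, f_enhance, chroma_block=False, chroma_simulcast=0):
    """Form F''[v][u] from F''lower[v][u] and F''enhance[v][u] (8x8 blocks)."""
    simulcast_chroma = chroma_block and chroma_simulcast == 1
    combined = [[0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            if simulcast_chroma and (u != 0 or v != 0):
                # AC coefficients of the lower layer are discarded.
                combined[v][u] = f_enhance[v][u]
            else:
                # Luminance blocks, blocks without chroma simulcast, and the
                # DC coefficient of simulcast chrominance blocks are summed.
                combined[v][u] = f_lower[v][u] + f_enhance[v][u]
    return combined
```

The result is the single coefficient block that then enters saturation and the remaining decoding steps.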

Table 7-27 gives the index of the chrominance block whose DC coefficient (F''lower[0][0]) is to be used to
predict the DC coefficient in the coincident chrominance block of the enhancement layer
(F''enhance[0][0]).

Table 7-27. Block index used to predict DC coefficient

                                block index
chroma_format                   4    5    6    7    8    9    10   11
base: 4:2:0, upper: 4:2:2       4    5    4    5
base: 4:2:0, upper: 4:4:4       4    5    4    5    4    5    4    5
base: 4:2:2, upper: 4:4:4       4    5    6    7    4    5    6    7



















7.8.3.5 Remaining macroblock decoding steps

After addition of coefficients from the two layers, the remainder of the macroblock decoding steps is
exactly as described in 7.4.3 to 7.6 (7.7, 7.9, if applicable), since there is now only one data stream
F''[v][u] to be processed.

In this process, the spatio/temporal prediction p[y][x] is derived according to the macroblock type syntax
elements and flags for the current macroblock known from the lower layer bitstream.

7.9 Temporal scalability

Temporal scalability involves two layers, a lower layer and an enhancement layer. Both the lower and the
enhancement layers process the same spatial resolution. The enhancement layer enhances the temporal
resolution of the lower layer and if temporally remultiplexed with the lower layer provides full temporal
rate. This is the frame rate indicated in the enhancement layer. The decoding process for enhancement
layer pictures is similar to the normal decoding process described in 7.1 to 7.6. The only difference is in
the "Prediction field and frame selection" described in 7.6.2.

The reference frames for prediction are selected by reference_select_code as described in Tables 7-28 and 
7-29. In P pictures, the forward reference picture can be one of the following three: most recent 
enhancement picture, most recent lower layer frame, or next lower layer frame in display order. Note that 
in the latter case, the reference frame in lower layer used for prediction is backward in time. 

In B-pictures, the forward reference can be one of the following two: the most recent enhancement
picture or the most recent (or temporally coincident) lower layer frame, whereas the backward reference can
be one of the following two: the most recent lower layer picture, including the temporally coincident picture, in
display order or the next lower layer frame in display order. Note that in this case, the backward reference
frame in the lower layer used for prediction is forward in time.

Backward prediction cannot be made from a picture in the enhancement layer. This avoids the need for
frame reordering in the enhancement layer. The motion compensation process forms predictions using lower
layer decoded pictures and/or previous temporal prediction from the enhancement layer.

The enhancement layer can contain I-pictures, P-pictures or B-pictures, but B-pictures in enhancement 
layer behave more like P-pictures in the sense that a decoded B-picture can be used to predict the 
following P-pictures or B-pictures in the enhancement layer. 

When the most recent frame in the lower layer is used as the reference, this includes the frame that is 
temporally coincident with the frame or the first field (in case of field pictures) in the enhancement layer. 
The prediction references used for P-pictures and B-pictures are shown in Table 7-28 and Table 7-29
respectively.

The lower and enhancement layers shall use the restricted slice structure.



Table 7-28. Prediction references selection in P-pictures

reference_select_code    forward prediction reference
00                       Most recent decoded enhancement picture(s)
01                       Most recent lower layer frame in display order
10                       Next lower layer frame in display order
11                       forbidden



Table 7-29. Prediction references selection in B-pictures

reference_select_code    forward prediction reference                        backward prediction reference
00                       forbidden                                           forbidden
01                       Most recent decoded enhancement picture(s)          Most recent lower layer picture in display order
10                       Most recent decoded enhancement picture(s)          Next lower layer picture in display order
11                       Most recent lower layer picture in display order    Next lower layer picture in display order
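
Tables 7-28 and 7-29 amount to a small lookup from picture type and reference_select_code to a pair of references. The sketch below illustrates this under stated assumptions: the constant names, the use of "P" and "B" strings for the picture type, and the function name are all hypothetical, and the reference labels are shorthand for the table entries.

```python
ENH_RECENT = "most recent decoded enhancement picture(s)"
LOWER_RECENT = "most recent lower layer picture in display order"
LOWER_NEXT = "next lower layer picture in display order"

# Table 7-28: forward reference only; code 11 is forbidden.
P_PICTURE_REFERENCES = {0b00: ENH_RECENT, 0b01: LOWER_RECENT, 0b10: LOWER_NEXT}

# Table 7-29: (forward, backward); code 00 is forbidden.
B_PICTURE_REFERENCES = {
    0b01: (ENH_RECENT, LOWER_RECENT),
    0b10: (ENH_RECENT, LOWER_NEXT),
    0b11: (LOWER_RECENT, LOWER_NEXT),
}


def prediction_references(picture_type, reference_select_code):
    """Return (forward, backward) references; backward is None in P-pictures."""
    if picture_type == "P":
        table = {code: (fwd, None) for code, fwd in P_PICTURE_REFERENCES.items()}
    elif picture_type == "B":
        table = B_PICTURE_REFERENCES
    else:
        raise ValueError("only P-pictures and B-pictures select references")
    if reference_select_code not in table:
        raise ValueError("forbidden reference_select_code for this picture type")
    return table[reference_select_code]
```

Note that no entry offers an enhancement layer picture as a backward reference, in line with the restriction stated earlier in this clause.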



Figure 7-16 shows a simplified diagram of the motion compensation process for the enhancement layer
using temporal scalability.

Figure 7-16. Simplified motion compensation process for the enhancement layer using temporal
scalability

I-pictures do not use prediction references; to indicate this, the reference_select_code for I-pictures shall
be '11'.

Depending on picture_coding_type, when forward_temporal_reference or backward_temporal_reference
do not imply references to be used for prediction, they shall take the value 0.



7.9.1 Higher syntactic structures

The two bitstream layers in this clause are identified by their layer_id, decoded from the
sequence_scalable_extension.

The two bitstreams shall have consecutive layer_ids, with the enhancement layer having layer_id = id_enhance
and the lower layer having layer_id = id_enhance - 1.

The syntax and semantics of enhancement layers are as defined in clauses 6.2 and 6.3 respectively.

Semantic restrictions apply to several values in the headers and extensions of the enhancement layer as
follows.

The lower layer shall conform to this specification (and not to ISO/IEC 11172-2).

Sequence header

The values in this header can be different from the lower layer except for horizontal_size_value,
vertical_size_value and aspect_ratio_information.

Sequence extension 

This extension shall be identical to the one in the lower layer except for values of
profile_and_level_indication, bit_rate_extension, vbv_buffer_size_extension, low_delay,
frame_rate_extension_n and frame_rate_extension_d. These can be selected independently. Note that
progressive_sequence indicates the scanning format of the enhancement layer frames only rather than of
the output frames after multiplexing. The latter is indicated by mux_to_progressive_sequence (see
sequence scalable extension).

Sequence display extension 

This extension shall not be present as there is no separate display process for the enhancement layer.

Sequence scalable extension

This extension shall be present with scalable_mode = "Temporal scalability".

When progressive_sequence=0 and mux_to_progressive_sequence=0, top_field_first and
picture_mux_factor can be selected.

When progressive_sequence=0 and mux_to_progressive_sequence=1, top_field_first shall contain a
complement of the value of top_field_first of the lower layer but picture_mux_factor shall be 1.

When progressive_sequence=1 and mux_to_progressive_sequence=1, top_field_first shall be zero but
picture_mux_factor can be selected.

The combination of progressive_sequence=1 and mux_to_progressive_sequence=0 shall not occur.
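
The four combinations above can be checked mechanically. The following sketch is an informal aid, assuming all flags are available as integers; the function name, the lower_top_field_first parameter name and the returned message strings are inventions of this example.

```python
def check_temporal_scalable_flags(progressive_sequence,
                                  mux_to_progressive_sequence,
                                  top_field_first,
                                  lower_top_field_first,
                                  picture_mux_factor):
    """Return a list of violated restrictions (empty when none are violated)."""
    problems = []
    if progressive_sequence == 1 and mux_to_progressive_sequence == 0:
        problems.append("progressive_sequence=1 with "
                        "mux_to_progressive_sequence=0 shall not occur")
    elif progressive_sequence == 0 and mux_to_progressive_sequence == 1:
        if top_field_first != 1 - lower_top_field_first:
            problems.append("top_field_first shall be the complement of the "
                            "lower layer value")
        if picture_mux_factor != 1:
            problems.append("picture_mux_factor shall be 1")
    elif progressive_sequence == 1 and mux_to_progressive_sequence == 1:
        if top_field_first != 0:
            problems.append("top_field_first shall be zero")
    # progressive_sequence=0, mux_to_progressive_sequence=0: both values
    # can be selected freely, so nothing is checked.
    return problems
```
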
GOP header 

There is no restriction on GOP header (if present) to be the same as that for lower layer 
Picture header 

There is no restriction on picture headers to be the same as in the lower layer. 
Picture coding extension 

The values in this extension can be different from the lower layer except for top_field_first,
concealment_motion_vectors, chroma_420_type and progressive_frame. The top_field_first shall be
based on progressive_sequence and mux_to_progressive_sequence (see sequence_scalable_extension
above) and concealment_motion_vectors shall be 0. chroma_420_type shall be identical to the lower
layer. progressive_frame shall always have the same value as progressive_sequence.

Picture temporal scalable extension 

This extension shall be present for each picture.

Quant matrix extension

This extension may be present.

7.9.2




Although temporal predictions can be made from decoded pictures referenced by
forward_temporal_reference or both forward_temporal_reference and backward_temporal_reference,
temporal scalability is efficient if predictions are formed using decoded picture/pictures from lower layer
and enhancement layer that are very close in time to the enhancement picture being predicted. It is a
requirement on the bitstreams that P-pictures and B-pictures shall form predictions from most recent or
next pictures as illustrated by Tables 7-28 and 7-29.

In case group_of_pictures_header occurs very often in the lower layer, ambiguity can occur due to possibility
of nonuniqueness of temporal references (which are reset at each group_of_pictures_header). This
ambiguity shall be resolved with help of systems layer timing information.

7.10 Data partitioning

Data partitioning is a technique that splits a video bitstream into two layers, called partitions. A priority
breakpoint indicates which syntax elements are placed in partition 0, which is the base partition (also
called high priority partition). The remainder of the bitstream is placed in partition 1 (which is also called
low priority partition). Sequence, GOP, and picture headers are redundantly copied in partition 1 to
facilitate error recovery. The sequence_end_code is also redundantly copied into partition 1. All fields in
the redundant headers must be identical to the original ones. The only extensions allowed (and required)
in partition 1 are sequence_extension(), picture_coding_extension() and sequence_scalable_extension().

NOTE - The slice() syntax given in 6.2.4 is followed in both partitions up to (and including) the syntax
element extra_bit_slice.

The interpretation of priority_breakpoint is given in Table 7-30.



Table 7-30. Priority breakpoint values and associated semantics

priority_breakpoint    Syntax elements included in partition zero
0                      This value is reserved for partition 1. All slices in partition 1 shall have a
                       priority_breakpoint equal to 0.
1                      All data at the sequence, GOP, picture and slice() down to extra_bit_slice in
                       slice().
2                      All data included above, plus macroblock syntax elements up to and including
                       macroblock_address_increment.
3                      All data included above, plus macroblock syntax elements up to but not including
                       coded_block_pattern().
4...63                 Reserved.
64                     All syntax elements up to and including coded_block_pattern() or DC coefficient
                       (dct_dc_differential), and the first (run, level) DCT coefficient pair (or EOB).
65                     All syntax elements above, plus up to 2 (run, level) DCT coefficient pairs.
...
63+j                   All syntax elements above, plus up to j (run, level) DCT coefficient pairs.
...
127                    All syntax elements above, plus up to 64 (run, level) DCT coefficient pairs.
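
For the values 64 to 127, the table follows the pattern 63+j, so the number of (run, level) pairs that may appear in partition 0 can be computed directly. The sketch below illustrates this reading of Table 7-30; the function name is hypothetical, and returning 0 for the values 1 to 3 (no coefficient pairs in partition 0) is a convention chosen for this example.

```python
def coefficient_pairs_in_partition_0(priority_breakpoint):
    """Maximum number of (run, level) pairs per block carried in partition 0."""
    if not 0 <= priority_breakpoint <= 127:
        raise ValueError("priority_breakpoint is outside the range 0 to 127")
    if priority_breakpoint == 0:
        raise ValueError("value 0 is reserved for slices in partition 1")
    if 4 <= priority_breakpoint <= 63:
        raise ValueError("values 4 to 63 are reserved")
    if priority_breakpoint <= 3:
        # Breakpoints 1 to 3 fall before any DCT coefficient data.
        return 0
    # 64 -> 1 pair, 65 -> 2 pairs, ..., 127 -> 64 pairs.
    return priority_breakpoint - 63
```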



Note that a priority_breakpoint immediately following the DC coefficient is disallowed since it might cause
start code emulation.

Figure 7-17. A segment from a bitstream with two partitions, with priority_breakpoint set to 64 (one
(run, level) pair). The two partitions are shown, with arrows indicating how the decoder needs to
switch between partitions.



Semantics of VBV remains unchanged, i.e. the VBV refers to the sum of two partitions, not any single 
one. 

The bitstream parameters bit_rate (bit_rate_value and bit_rate_extension), vbv_buffer_size
(vbv_buffer_size_value and vbv_buffer_size_extension) and vbv_delay shall take the same value in the
two partitions. These parameters refer to the characteristics of the entire bitstream formed from the two
partitions.

The decoding process is modified in the following manner: 

Set current_partition to 0, and start decoding from the bitstream that contains the
sequence_scalable_extension (partition 0).

If current_partition = 0, check to see if the current point in the bitstream is a priority breakpoint.

        If yes, set current_partition to 1. The next item will be decoded from partition 1.

        Otherwise, continue decoding from partition 0. Remove sequence, GOP, and picture
        headers from both partitions.

If current_partition = 1, check the priority breakpoint to see if the next item to be decoded is
expected in partition 0.

        If yes, set current_partition to 0. The next item will be decoded from partition 0.

        Otherwise, continue decoding from partition 1.

An example is shown in Figure 7-17 where the priority breakpoint is set at 64 (one (run, level) pair).
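
The switching between partitions can be sketched for the coefficient data of a single block. This is a deliberately simplified model and not the normative process: each partition is assumed to be an iterator of already-parsed tokens, a token being either a (run, level) tuple or the EOB marker, and headers and macroblock-level elements are ignored. The names EOB and decode_block_pairs are hypothetical.

```python
EOB = "EOB"  # end-of-block marker used by this sketch


def decode_block_pairs(partition0, partition1, priority_breakpoint):
    """Collect the (run, level) pairs of one block from the two partitions."""
    if not 64 <= priority_breakpoint <= 127:
        raise ValueError("this sketch only models breakpoints 64 to 127")
    pairs_in_partition0 = priority_breakpoint - 63
    streams = (partition0, partition1)
    current_partition = 0  # every block starts in partition 0
    pairs = []
    while True:
        token = next(streams[current_partition])
        if token == EOB:
            # Block finished; the next block starts again in partition 0.
            return pairs
        pairs.append(token)
        if current_partition == 0 and len(pairs) == pairs_in_partition0:
            # Priority breakpoint reached: remaining pairs are in partition 1.
            current_partition = 1
```

With priority_breakpoint equal to 64, one pair per block is read from partition 0 and the rest of the block, including its EOB, from partition 1. A block whose first token in partition 0 is EOB never touches partition 1.
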
7.11 Hybrid scalability

Hybrid scalability is the combination of two different types of scalability. The types of scalability that can
be combined are SNR scalability, spatial scalability and temporal scalability. When two types of
scalability are combined, there are three bitstreams that have to be decoded. The layers to which these
bitstreams belong are named in Table 7-31.



Table 7-31. Names of layers

layer_id    name
0           base layer
1           enhancement layer 1
2           enhancement layer 2
...         ...

For the scalability between the enhancement layers 1 and 2, the enhancement layer 1 is its lower layer,
and the enhancement layer 2 is its enhancement layer. No layer can be omitted from the hierarchical
ladder. E.g., if there is SNR scalability between enhancement layer 1 and enhancement layer 2, the
prediction types in enhancement layer 1 are also valid for the combined decoding process for enhancement
layers 1 and 2.

The coupling of layers is more loose with spatial and temporal scalability than with SNR scalability.
Therefore, in these kinds of scalability, first the base layer has to be decoded and upconverted before it can
be used in the enhancement layer. In SNR scalability, both layers are decoded simultaneously. The
decoding order can be summarised as follows:

case 1:

        base layer
                <spatial or temporal scalability>
        enhancement layer 1
                <SNR scalability>
        enhancement layer 2

First decode the base layer, and then decode both enhancement layers simultaneously.

case 2:

        base layer
                <SNR scalability>
        enhancement layer 1
                <spatial or temporal scalability>
        enhancement layer 2

First decode the base layer and the enhancement layer 1 simultaneously, and then decode the
enhancement layer 2.

case 3:

        base layer
                <spatial or temporal scalability>
        enhancement layer 1
                <spatial or temporal scalability>
        enhancement layer 2

First decode the base layer, then decode the enhancement layer 1, and finally decode enhancement layer 2.
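
The three cases can be summarised as a grouping of layers into decoding stages, where layers in the same stage are decoded simultaneously. The sketch below is an informal restatement; the function name and the string values "SNR", "spatial" and "temporal" are assumptions of this example. Since hybrid scalability combines two different types, equal types are rejected.

```python
def decoding_stages(base_to_layer1, layer1_to_layer2):
    """Group the three layers into stages; each stage is decoded together."""
    kinds = {"SNR", "spatial", "temporal"}
    if base_to_layer1 not in kinds or layer1_to_layer2 not in kinds:
        raise ValueError("unknown type of scalability")
    if base_to_layer1 == layer1_to_layer2:
        raise ValueError("hybrid scalability combines two different types")
    base, enh1, enh2 = "base layer", "enhancement layer 1", "enhancement layer 2"
    if layer1_to_layer2 == "SNR":
        # case 1: both enhancement layers are decoded simultaneously.
        return [[base], [enh1, enh2]]
    if base_to_layer1 == "SNR":
        # case 2: base layer and enhancement layer 1 are decoded simultaneously.
        return [[base, enh1], [enh2]]
    # case 3: spatial and temporal scalability only, fully sequential.
    return [[base], [enh1], [enh2]]
```
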
7.12 Output of the decoding process

This section describes the output of the theoretical model of the decoding process that decodes bitstreams
conforming to this specification.

The decoding process input is one or more coded video bitstreams (one for each of the layers). The video
layers are generally multiplexed by the means of a system stream that also contains timing information.

The output of the decoding process is a series of fields or frames that are normally the input of a display
process. The order in which fields or frames are output by the decoding process is called the display
order, and may be different from the coded order (when B-pictures are used). The display process is
responsible for the action of displaying the decoded fields or frames on a display device. If the display
device cannot display at the frame rate indicated in the bitstream, the display process may perform frame
rate conversion. This specification does not describe a theoretical model of display process nor the
operation of the display process.

Since some of the syntax elements, such as progressive_frame, may be needed by the display process, in
this theoretical model of the decoding process, all the syntactic elements that are decoded by the decoding
process are output by the decoding process and may be accessed by the display process.

When a progressive sequence is decoded (progressive_sequence is equal to 1), the luminance and
chrominance samples of the reconstructed frames are output by the decoding process in the form of
progressive frames and the output rate is the frame rate. Figure 7-18 illustrates this in the case of
chroma_format equal to 4:2:0.




(frame period = 1/frame_rate)

Figure 7-18. progressive_sequence = 1

The same reconstructed frame is output one time if repeat_first_field is equal to 0, and two or three
consecutive times if repeat_first_field is equal to 1, depending on the value of top_field_first. Figure 7-19
illustrates this in the case of chroma_format equal to 4:2:0 and repeat_first_field equal to 1.
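
The number of times a frame is output in a progressive sequence can be written as a small function. The paragraph above only states that the choice between two and three depends on top_field_first; the mapping used below (two times for top_field_first equal to 0, three times for top_field_first equal to 1) is taken from the semantics of these flags as understood for this example and should be checked against the semantics clause. The function name is hypothetical.

```python
def frame_output_count(repeat_first_field, top_field_first):
    """Times a reconstructed frame is output when progressive_sequence = 1."""
    if repeat_first_field == 0:
        return 1
    # Assumed mapping: top_field_first = 0 gives 2, top_field_first = 1 gives 3.
    return 3 if top_field_first == 1 else 2
```
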
Figure 7-19. progressive_sequence = 1, repeat_first_field = 1

When decoding an interlaced sequence (progressive_sequence is equal to 0), the luminance samples of
the reconstructed frames are output by the decoding process in the form of interlaced fields at a rate that is
twice the frame rate. Figure 7-20 illustrates this.

Figure 7-20. progressive_sequence = 0

It is a requirement on the bitstream that the fields at the output of the decoding process shall always be
alternately top and bottom (note that the very first field of a sequence may be either top or bottom).

If the reconstructed frame is interlaced (progressive_frame is equal to 0), the luminance samples and
chrominance samples are output by the decoding process in the form of two consecutive fields. The first
field output by the decoding process is the top field or the bottom field of the reconstructed frame,
depending on the value of top_field_first.

Although all the samples of progressive frames represent the same instant in time, all the samples are not
output at the same time by the decoding process when the sequence is interlaced.

If the reconstructed frame is progressive (progressive_frame is equal to 1), the luminance samples are
output by the decoding process in the form of two or three consecutive fields, depending on the value of
repeat_first_field.

NOTE - The information that these fields originate from the same progressive frame in the bitstream
is conveyed to the display process.

All of the chrominance samples of the reconstructed progressive frame are output by the decoding process
at the same time as the first field of luminance samples. This is illustrated in Figures 7-21 and 7-22.

Figure 7-21. progressive_sequence = 0 with 4:2:0 chrominance.

Figure 7-22. progressive_sequence = 0 with 4:2:2 or 4:4:4 chrominance.

8 Profiles and levels

NOTE - In this Specification the word "profile" is used as defined below. It should not be confused
with other definitions of "profile" and in particular it does not have the meaning that is
defined by JTC1/SGFS.

Profiles and levels provide a means of defining subsets of the syntax and semantics of this Specification 
and thereby the decoder capabilities required to decode a particular bitstream. A profile is a defined sub- 
set of the entire bitstream syntax that is defined by this Specification. A level is a defined set of 
constraints imposed on parameters in the bitstream. Conformance tests will be carried out against defined 
profiles at defined levels. 

The purpose of defining conformance points in the form of profiles and levels is to facilitate bitstream
interchange among different applications. Implementers of this Specification are encouraged to produce 
decoders and bitstreams which correspond to those defined conformance regions. The discretely defined 
profiles and levels are the means of bitstream interchange between applications of this Specification. 

In this clause the constrained parts of the defined profiles and levels are described. All syntactic elements 
and parameter values which are not explicitly constrained may take any of the possible values that are 
allowed by this Specification. In general, a decoder shall be deemed to be conformant to a given profile at 
a given level if it is able to properly decode all allowed values of all syntactic elements as specified by that 
profile at that level. One exception to this rule exists in the case of a Simple profile Main level decoder,
which must also be able to decode Main profile, Low level bitstreams. A bitstream shall be deemed to be
conformant if it does not exceed the allowed range of allowed values and does not include disallowed 
syntactic elements. 

Attention is drawn to 5.4 which defines the convention for specifying a range of numbers. This is used 
throughout to specify the range of values and parameters. 

The profile_and_level_indication in the sequence_extension indicates the profile and level to which the
bitstream complies. The meaning of the bits in this parameter is defined in Table 8-1.



Table 8-1. Meaning of bits in profile_and_level_indication.

Bits     Field size (bits)   Meaning
[7:7]    1                   Escape bit
[6:4]    3                   Profile identification
[3:0]    4                   Level identification



Table 8-2 specifies the profile identification codes and Table 8-3 the level identification codes. When the
escape bit equals zero a profile with a numerically larger identification value will be a subset of a profile
with a numerically smaller identification value. Similarly, whenever the escape bit equals zero, a level
with a numerically larger identification value will be a subset of a level with a numerically smaller
identification value.

Table 8-2. Profile identification.

Profile identification   Profile
110 to 111               (reserved)
101                      Simple
100                      Main
011                      SNR Scalable
010                      Spatially Scalable
001                      High
000                      (reserved)


Table 8-3. Level identification.

Level identification   Level
1011 to 1111           (reserved)
1010                   Low
1001                   (reserved)
1000                   Main
0111                   (reserved)
0110                   High 1440
0101                   (reserved)
0100                   High
0000 to 0011           (reserved)



Table 8-4 describes profiles and levels when the escape bit equals 1. For these profiles and levels there is
no implied hierarchy from the assignment of profile_and_level_indication and profiles and levels are not
necessarily subsets of others.
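
The bit layout of Table 8-1 and the codes of Tables 8-2 and 8-3 can be illustrated with a small sketch. The function name parse_profile_and_level and the returned dictionary are assumptions made for this illustration only; they are not part of this Specification.

```python
# Illustrative sketch: split an 8-bit profile_and_level_indication into its
# fields (Table 8-1) and name them using Tables 8-2 and 8-3.
PROFILES = {
    0b101: "Simple",
    0b100: "Main",
    0b011: "SNR Scalable",
    0b010: "Spatially Scalable",
    0b001: "High",
}
LEVELS = {
    0b1010: "Low",
    0b1000: "Main",
    0b0110: "High 1440",
    0b0100: "High",
}


def parse_profile_and_level(value):
    """Return the escape bit, profile name and level name for an 8-bit value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("profile_and_level_indication is an 8-bit field")
    escape = (value >> 7) & 0b1           # bit [7:7]
    if escape:
        # Table 8-4: every value with the escape bit set is reserved.
        return {"escape": 1, "profile": "(reserved)", "level": "(reserved)"}
    profile_id = (value >> 4) & 0b111     # bits [6:4]
    level_id = value & 0b1111             # bits [3:0]
    return {
        "escape": 0,
        "profile": PROFILES.get(profile_id, "(reserved)"),
        "level": LEVELS.get(level_id, "(reserved)"),
    }
```

For example, the value 0100 1000 has escape bit 0, profile identification 100 and level identification 1000, which the sketch reports as Main profile at Main level.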



Table 8-4. Escape profile_and_level_indication identification.

profile_and_level_indication   Name
10000000 to 11111111           (reserved)



Attention is drawn to Annex E which describes in detail those parts of ISO/IEC 13818-2 that are used for
a given profile and level.

8.1 ISO/IEC 11172-2 compatibility

ISO/IEC 11172-2 "constrained parameter" bitstreams shall be decodable by Simple, Main, SNR Scalable,
Spatially Scalable and High profile decoders at all levels. When a bitstream conforming to ISO/IEC
11172-2 constrained parameter coding is generated, the constrained_parameters_flag shall be set.

Additionally Simple, Main, SNR Scalable, Spatially Scalable and High profile decoders shall be able to
decode D-pictures-only bitstreams of ISO/IEC 11172-2 which are within the level constraints of the
decoder.

8.2 Relationship between defined profiles

The Simple, Main, SNR Scalable, Spatially Scalable and High profiles have a hierarchical relationship.
Therefore the syntax supported by a 'higher' profile includes all the syntactic elements of 'lower' profiles
(e.g., for a given level, a Main profile decoder shall be able to decode a bitstream conforming to Simple
profile restrictions). For a given profile, the same syntax set is supported regardless of level. The order of
hierarchy is given in Table 8-2.

The syntactic differences between constraints of profiles are given in Table 8-5. This table describes the
limits which apply to a bitstream. Note that a Simple Profile conformant decoder must be able to fully
decode both Simple profile, Main level and Main profile, Low level bitstreams.



Table 8-5. Syntactic constraints of profiles

                                         Profile
Syntactic element                        Simple             Main               SNR                Spatial            High
chroma_format                            4:2:0              4:2:0              4:2:0              4:2:0              4:2:2 or 4:2:0
frame_rate_extension_n                   0                  0                  0                  0                  0
frame_rate_extension_d                   0                  0                  0                  0                  0
aspect_ratio_information                 0001, 0010, 0011   0001, 0010, 0011   0001, 0010, 0011   0001, 0010, 0011   0001, 0010, 0011
picture_coding_type                      I, P               I, P, B            I, P, B            I, P, B            I, P, B
repeat_first_field                       Constrained        Constrained        Unconstrained      Unconstrained      Unconstrained
sequence_scalable_extension()            No                 No                 Yes                Yes                Yes
scalable_mode                            -                  -                  SNR                SNR or Spatial     SNR or Spatial
picture_spatial_scalable_extension()     No                 No                 No                 Yes                Yes
intra_dc_precision                       8, 9, 10           8, 9, 10           8, 9, 10           8, 9, 10           8, 9, 10, 11
Slice structure                          Restricted for all profiles (see 6.1.2.2)



For all defined profiles, there is a semantic restriction on the bitstream that all of the data for a
macroblock shall be represented with not more than the number of bits indicated by Table 8-6. However,
a maximum of two macroblocks in each horizontal row of macroblocks may exceed this limitation.

In this context a macroblock is deemed to start with the first bit of the macroblock_address_increment (or
macroblock_escape, if any) and continues until the last bit of the "End of block" symbol of the last coded
block (or the last bit of the coded_block_pattern() if there are no coded blocks) in the macroblock() syntactic
structure. The bits required to represent any slice() that precedes (or follows) the macroblock are not
counted as part of the macroblock.

Table 8-6. Maximum number of bits in a macroblock

chroma_format   Maximum number of bits
4:2:0           4608
4:2:2           6144
4:4:4           9216
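
The macroblock size restriction above can be sketched as a simple check on one horizontal row of macroblocks. The function name row_within_limit and its list-of-bit-counts input are assumptions made for this illustration; how the bit counts are measured follows the definition of macroblock start and end given in the text.

```python
# Illustrative sketch: check one horizontal row of macroblocks against the
# limit of Table 8-6, allowing at most two macroblocks to exceed it.
MAX_MACROBLOCK_BITS = {"4:2:0": 4608, "4:2:2": 6144, "4:4:4": 9216}


def row_within_limit(bits_per_macroblock, chroma_format):
    """bits_per_macroblock: coded size in bits of each macroblock in the row."""
    limit = MAX_MACROBLOCK_BITS[chroma_format]
    oversize = sum(1 for bits in bits_per_macroblock if bits > limit)
    return oversize <= 2
```

A row in which three macroblocks of a 4:2:0 picture each use 5000 bits would fail this check, while the same row in a 4:2:2 picture would pass.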



The High profile is also distinguished by having different constraints on luminance sample rate,
maximum bit rate, and VBV buffer size. Refer to Tables 8-12, 8-13 and 8-14.

Decoders that are Simple profile @ Main level compliant shall be capable of decoding Main profile @ 
Low level bitstreams. 

samples/line : horizontal_size_value
lines/frame : vertical_size_value
frames/sec : frame_rate_value

8.2.1 Use of repeat_first_field

The use of repeat_first_field in Simple and Main profile bitstreams is constrained as specified in Table 
8-7. 



Table 8-7. Constraints on use of repeat_first_field for Simple and Main Profiles

                                                 repeat_first_field
frame_rate_code   frame_rate_value               progressive_sequence == 0   progressive_sequence == 1
0000              forbidden
0001              24 000 ÷ 1001 (23,976...)      0                           0
0010              24                             0                           0
0011              25                             0 or 1                      0
0100              30 000 ÷ 1001 (29,97...)       0 or 1                      0
0101              30                             0 or 1                      0
0110              50                             0 or 1                      0
0111              60 000 ÷ 1001 (59,94...)       0 or 1                      0 or 1
1000              60                             0 or 1                      0 or 1
...               reserved
1111              reserved







Additional constraints exist for Main profile @ Main level and Simple profile @ Main level only:

• if (vertical_size > 480 lines) or (frame_rate is "25Hz")

then if picture_coding_type = 011 (i.e. B-picture), repeat_first_field shall be 0.

• if vertical_size > 480 lines, frame_rate shall be "25Hz"
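
The constraints of Table 8-7 and the first additional Main level rule can be sketched as a lookup. The function names and argument order below are assumptions made for this illustration; the sketch does not cover the second rule, which constrains frame_rate rather than repeat_first_field.

```python
# Illustrative sketch of Table 8-7 for Simple and Main profile bitstreams.
# Key: frame_rate_code. Value: allowed repeat_first_field values for
# progressive_sequence == 0 and progressive_sequence == 1 respectively.
ALLOWED_RFF = {
    0b0001: ({0}, {0}),
    0b0010: ({0}, {0}),
    0b0011: ({0, 1}, {0}),
    0b0100: ({0, 1}, {0}),
    0b0101: ({0, 1}, {0}),
    0b0110: ({0, 1}, {0}),
    0b0111: ({0, 1}, {0, 1}),
    0b1000: ({0, 1}, {0, 1}),
}


def rff_allowed(frame_rate_code, progressive_sequence, repeat_first_field):
    """Check repeat_first_field against Table 8-7."""
    if frame_rate_code not in ALLOWED_RFF:
        raise ValueError("forbidden or reserved frame_rate_code")
    return repeat_first_field in ALLOWED_RFF[frame_rate_code][progressive_sequence]


def rff_allowed_at_main_level(frame_rate_code, progressive_sequence,
                              repeat_first_field, vertical_size, is_b_picture):
    """Table 8-7 plus the B-picture rule for MP@ML and SP@ML."""
    if not rff_allowed(frame_rate_code, progressive_sequence, repeat_first_field):
        return False
    is_25_hz = frame_rate_code == 0b0011
    if (vertical_size > 480 or is_25_hz) and is_b_picture:
        return repeat_first_field == 0
    return True
```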

8.3 Relationship between defined levels

The Low, Main, High-1440 and High levels have a hierarchical relationship. Therefore the parameter
constraints of a 'higher' level equal or exceed the constraints of 'lower' levels (e.g., for a given profile, a
Main level decoder shall be able to decode a bitstream conforming to Low level restrictions). The order of
hierarchy is given in Table 8-3.

The different parameter constraints for levels are given in Table 8-8. 



Table 8-8. Parameter constraints for levels

                                          Level
Syntactic element                         Low            Main            High-1440       High
f_code[0][0] (forward horizontal)         [1:7]          [1:8]           [1:9]           [1:9]
f_code[1][0]* (backward horizontal)       [1:7]          [1:8]           [1:9]           [1:9]
Frame picture
  f_code[0][1] (forward vertical)         [1:4]          [1:5]           [1:5]           [1:5]
  f_code[1][1]* (backward vertical)       [1:4]          [1:5]           [1:5]           [1:5]
  vertical vector range†                  [-64:63,5]     [-128:127,5]    [-128:127,5]    [-128:127,5]
Field picture
  f_code[0][1] (forward vertical)         [1:3]          [1:4]           [1:4]           [1:4]
  f_code[1][1]* (backward vertical)       [1:3]          [1:4]           [1:4]           [1:4]
  vertical vector range†                  [-32:31,5]     [-64:63,5]      [-64:63,5]      [-64:63,5]
frame_rate_code                           [1:5]          [1:5]           [1:8]           [1:8]
Sample Density                            See Table 8-11
Luminance Sample Rate                     See Table 8-12
Maximum Bit Rate                          See Table 8-13
Buffer Size                               See Table 8-14

* For Simple profile bitstreams which do not include B-pictures, f_code[1][0] and
f_code[1][1] shall be set to 15 (not used).

† This restriction applies to the final reconstructed motion vector. In the case of
dual prime motion vectors it applies before scaling is performed, after scaling is
performed and after the small differential motion vector has been added.



8.4 Scalable layers 

The SNR Scalable, Spatial Scalable and High profiles may use more than one bitstream to code the image.
These different bitstreams represent layers of coding, which when combined create a higher quality image
than that obtainable from one layer alone (see Annex D). The maximum number of layers for a given
profile is specified in Table 8-9. The scalable layers are named according to Table 7-31. The syntactic and
parameter constraints for these profile / level combinations when coded using the maximum permitted
number of layers are given in Tables 8-11, 8-12, 8-13 and 8-14. When the number of layers is less than
the maximum permitted, reference should also be made to Tables E-21 to E-46 as appropriate.

It should be noted that the base layer of an SNR Scalable profile bitstream can always be decoded by a
Main profile decoder of equivalent level. Conversely, a Main profile bitstream shall be decodable by an
SNR profile decoder of equivalent level.



Table 8-9. Upper bounds for scalable layers in SNR Scalable, Spatially Scalable and High profiles

                                            Profile
Level       Maximum number of               SNR     Spatial   High
High        All layers (base + enh.)        -       -         3
            Spatial enhancement layers      -       -         1
            SNR enhancement layers          -       -         1
High-1440   All layers (base + enh.)        -       3         3
            Spatial enhancement layers      -       1         1
            SNR enhancement layers          -       1         1
Main        All layers (base + enh.)        2       -         3
            Spatial enhancement layers      0       -         1
            SNR enhancement layers          1       -         1
Low         All layers (base + enh.)        2       -         -
            Spatial enhancement layers      0       -         -
            SNR enhancement layers          1       -         -







8.4.1 Permissible layer combinations 

Table 8-10 is a summary of the permitted combinations, and is subject to the following rules: 

° SNR Scalable profile - maximum of 2 layers; Spatially Scalable & High profile - maximum of 3
layers. (See Table 8-9)

° Only one SNR and one Spatial scale allowed in 3-layer combinations, either SNR/Spatial or
Spatial/SNR order is permitted. (See Table 8-9)

° Adding 4:2:2 chroma format to a 4:2:0 lower layer is permitted for either SNR
or Spatial scale.

° A 4:2:0 layer is not permitted if the lower layer is 4:2:2. (See 7.7.3.3)
Table 8-10. Permissible layer combinations

          Scalable mode                                        Profile / level of simplest
Profile   Base layer   Enh. layer 1      Enh. layer 2          base layer decoder (level ref. top layer) *
SNR       4:2:0        SNR, 4:2:0        -                     MP@same level
Spatial   4:2:0        SNR, 4:2:0        -                     MP@same level
Spatial   4:2:0        Spatial, 4:2:0    -                     MP@(level - 1)
Spatial   4:2:0        SNR, 4:2:0        Spatial, 4:2:0        MP@(level - 1)
Spatial   4:2:0        Spatial, 4:2:0    SNR, 4:2:0            MP@(level - 1)
High      4:2:0        -                 -                     HP@same level
High      4:2:2        -                 -                     HP@same level
High      4:2:0        SNR, 4:2:0        -                     HP@same level
High      4:2:0        SNR, 4:2:2        -                     HP@same level
High      4:2:2        SNR, 4:2:2        -                     HP@same level
High      4:2:0        Spatial, 4:2:0    -                     HP@(level - 1)
High      4:2:0        Spatial, 4:2:2    -                     HP@(level - 1)
High      4:2:2        Spatial, 4:2:2    -                     HP@(level - 1) †
High      4:2:0        SNR, 4:2:0        Spatial, 4:2:0        HP@(level - 1)
High      4:2:0        SNR, 4:2:0        Spatial, 4:2:2        HP@(level - 1)
High      4:2:0        SNR, 4:2:2        Spatial, 4:2:2        HP@(level - 1) †
High      4:2:2        SNR, 4:2:2        Spatial, 4:2:2        HP@(level - 1) †
High      4:2:0        Spatial, 4:2:0    SNR, 4:2:0            HP@(level - 1)
High      4:2:0        Spatial, 4:2:0    SNR, 4:2:2            HP@(level - 1)
High      4:2:0        Spatial, 4:2:2    SNR, 4:2:2            HP@(level - 1)
High      4:2:2        Spatial, 4:2:2    SNR, 4:2:2            HP@(level - 1) †



* The simplest compliant decoder to decode the base layer is specified, assuming that the bitstream may
contain any syntax and parameter value permitted for the stated profile @ level, except scalability. Note
that for High profile @ Main level spatially scaled bitstreams, 'HP@(level - 1)' becomes 'MP@(level -
1)'. In the event that a base layer bitstream uses fewer syntactic elements or a reduced parameter range
than permitted, profile_and_level_indication may indicate a 'simpler' profile @ level.

† Note that 4:2:2 chroma format is not supported as a lower spatial layer of High profile @ Main level
(see Table 8-12).

Details of the different parameter limits that may be applied in each layer of a bitstream and the
corresponding appropriate profile_and_level_indication that should be used are given in Annex E,
Tables E-20 to E-45.

8.5 Parameter values for defined profiles, levels and layers



Table 8-11. Upper bounds for sampling density

            Spatial                             Profile
Level       resolution layer                    Simple   Main    SNR     Spatial   High
High        Enhancement        samples/line     -        1920    -       -         1920
                               lines/frame      -        1152    -       -         1152
                               frames/sec       -        60      -       -         60
            Lower              samples/line     -        -       -       -         960
                               lines/frame      -        -       -       -         576
                               frames/sec       -        -       -       -         30
High-1440   Enhancement        samples/line     -        1440    -       1440      1440
                               lines/frame      -        1152    -       1152      1152
                               frames/sec       -        60      -       60        60
            Lower              samples/line     -        -       -       720       720
                               lines/frame      -        -       -       576       576
                               frames/sec       -        -       -       30        30
Main        Enhancement        samples/line     720      720     720     -         720
                               lines/frame      576      576     576     -         576
                               frames/sec       30       30      30      -         30
            Lower              samples/line     -        -       -       -         352
                               lines/frame      -        -       -       -         288
                               frames/sec       -        -       -       -         30
Low         Enhancement        samples/line     -        352     352     -         -
                               lines/frame      -        288     288     -         -
                               frames/sec       -        30      30      -         -
            Lower              samples/line     -        -       -       -         -
                               lines/frame      -        -       -       -         -
                               frames/sec       -        -       -       -         -












NOTE - In the case of single layer or SNR scaled coding, the limits specified by 'Enhancement layer' apply.

The syntactic elements referenced by this table are as follows:

samples/line : horizontal_size
lines/frame : vertical_size
frames/sec : frame_rate

The upper bound for frame_rate is the same for both progressive_sequence = 0 and
progressive_sequence = 1.
Table 8-12. Upper bounds for luminance sample rate (samples/sec)

            Spatial             Profile
Level       resolution layer    Simple        Main          SNR           Spatial       High
High        Enhancement         -             62 668 800    -             -             62 668 800 (4:2:2)
                                                                                        83 558 400 (4:2:0)
            Lower               -             -             -             -             14 745 600 (4:2:2)
                                                                                        19 660 800 (4:2:0)
High-1440   Enhancement         -             47 001 600    -             47 001 600    47 001 600 (4:2:2)
                                                                                        62 668 800 (4:2:0)
            Lower               -             -             -             10 368 000    11 059 200 (4:2:2)
                                                                                        14 745 600 (4:2:0)
Main        Enhancement         10 368 000    10 368 000    10 368 000    -             11 059 200 (4:2:2)
                                                                                        14 745 600 (4:2:0)
            Lower               -             -             -             -             3 041 280 (4:2:0)
Low         Enhancement         -             3 041 280     3 041 280     -             -
            Lower               -             -             -             -             -












NOTE - In the case of single layer or SNR scaled coding, the limits specified by 'Enhancement layer' apply 



The luminance sample rate P is defined as follows:

For progressive_sequence = 1

P = (16 * ((horizontal_size + 15) / 16)) x (16 * ((vertical_size + 15) / 16)) x frame_rate

For progressive_sequence = 0

P = (16 * ((horizontal_size + 15) / 16)) x (32 * ((vertical_size + 31) / 32)) x frame_rate
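
As a worked illustration of the two formulas above, the following sketch computes P with integer (truncating) division. The function name luminance_sample_rate is an assumption made for this illustration, and frame_rate is passed as a plain number.

```python
# Illustrative sketch: luminance sample rate P as defined above.
# The picture width is rounded up to a multiple of 16 samples. The height is
# rounded up to a multiple of 16 lines for progressive sequences and to a
# multiple of 32 lines otherwise.
def luminance_sample_rate(horizontal_size, vertical_size, frame_rate,
                          progressive_sequence):
    width = 16 * ((horizontal_size + 15) // 16)
    if progressive_sequence:
        height = 16 * ((vertical_size + 15) // 16)
    else:
        height = 32 * ((vertical_size + 31) // 32)
    return width * height * frame_rate
```

For a 720 x 576 interlaced sequence at 25 frames/sec this gives 720 x 576 x 25 = 10 368 000 samples/sec, which equals the Main level bound for the Main profile in Table 8-12.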
Table 8-13. Upper bounds for bit rates (Mbit/s)

            Profile
Level       Simple   Main   SNR               Spatial                     High
High        -        80     -                 -                           100 all layers
                                                                          80 middle + base layer
                                                                          25 base layer
High-1440   -        60     -                 60 all layers               80 all layers
                                              40 middle + base layers     60 middle + base layers
                                              15 base layer               20 base layer
Main        15       15     15 both layers    -                           20 all layers
                            10 base layer                                 15 middle + base layer
                                                                          4 base layer
Low         -        4      4 both layers     -                           -
                            3 base layer







NOTES - 



1 This table defines the maximum rate of operation of the VBV for a coded bitstream of the
given profile and level. This rate is indicated by bit_rate (see 6.3.3).

2 This table defines the maximum permissible data rate for all layers up to and including the
stated layer. For multi-layer coding applications, the data rate apportioned between layers is
constrained only by the maximum rate permitted for a given layer as stated in this table.

3 1 Mbit = 1 000 000 bits



Table 8-14. VBV Buffer size requirements (bits)

                     Profile
Level       Layer    Simple       Main         SNR          Spatial      High
High        Enh. 2   -            -            -            -            12 222 464
            Enh. 1   -            -            -            -            9 781 248
            Base     -            9 781 248    -            -            3 047 424
High-1440   Enh. 2   -            -            -            7 340 032    9 781 248
            Enh. 1   -            -            -            4 882 432    7 340 032
            Base     -            7 340 032    -            1 835 008    2 441 216
Main        Enh. 2   -            -            -            -            2 441 216
            Enh. 1   -            -            1 835 008    -            1 835 008
            Base     1 835 008    1 835 008    1 212 416    -            475 136
Low         Enh. 2   -            -            -            -            -
            Enh. 1   -            -            475 136      -            -
            Base     -            475 136      360 448      -            -







NOTES -

1 The buffer size is calculated to be proportional to the maximum allowable bit rate, rounded
down to the nearest multiple of 16 x 1024 bits. The reference value for scaling is the Main
profile, Main level buffer size.

2 This table defines the total decoder buffer size required to decode all layers up to and
including the stated layer. For multi-layer coding applications, the allocation of buffer
memory between layers is constrained only by the maximum size permitted for a given layer
as stated in this table.

3 The syntactic element corresponding to this table is vbv_buffer_size (see 6.3.3).
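
The scaling rule of Note 1 can be sketched as follows, taking the Main profile, Main level entries of Tables 8-13 and 8-14 (15 Mbit/s and 1 835 008 bits) as the reference. The function name vbv_buffer_bits and the constant names are assumptions made for this illustration; the sketch is a reading of the note, not a normative definition.

```python
# Illustrative sketch of Note 1: scale the MP@ML buffer size in proportion to
# the maximum bit rate, then round down to a multiple of 16 x 1024 bits.
MP_ML_BIT_RATE = 15_000_000   # bits/s, Main profile @ Main level (Table 8-13)
MP_ML_BUFFER = 1_835_008      # bits, Main profile @ Main level (Table 8-14)
UNIT = 16 * 1024              # rounding granularity in bits


def vbv_buffer_bits(max_bit_rate):
    """max_bit_rate in bits/s, with 1 Mbit = 1 000 000 bits."""
    scaled = MP_ML_BUFFER * max_bit_rate // MP_ML_BIT_RATE
    return scaled // UNIT * UNIT
```

Under these assumptions an 80 Mbit/s bound scales to 9 781 248 bits and a 4 Mbit/s bound to 475 136 bits, in agreement with the corresponding entries of Table 8-14.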



Table 8-15. Forward compatibility between different profiles and levels

Profile & Level
indication in     Decoder
bitstream         HP@HL   HP@H-14   HP@ML   Spatial@H-14   SNR@ML   SNR@LL   MP@HL   MP@H-14   MP@ML   MP@LL   SP@ML
HP@HL             X       -         -       -              -        -        -       -         -       -       -
HP@H-14           X       X         -       -              -        -        -       -         -       -       -
HP@ML             X       X         X       -              -        -        -       -         -       -       -
Spatial@H-14      X       X         -       X              -        -        -       -         -       -       -
SNR@ML            X       X         X       X              X        -        -       -         -       -       -
SNR@LL            X       X         X       X              X        X        -       -         -       -       -
MP@HL             X       -         -       -              -        -        X       -         -       -       -
MP@H-14           X       X         -       X              -        -        X       X         -       -       -
MP@ML             X       X         X       X              X        -        X       X         X       -       -
MP@LL             X       X         X       X              X        X        X       X         X       X       X*
SP@ML             X       X         X       X              X        -        X       X         X       -       X
ISO/IEC 11172     X       X         X       X              X        X        X       X         X       X       X


X indicates the decoder shall be able to decode the bitstream including all relevant lower layers. 
* Note that SP@ML decoders are required to decode MP@LL bitstreams. 



NOTE - For Profiles and Levels which obey a hierarchical structure, it is recommended that each
layer of the bitstream should contain the profile_and_level_indication of the "simplest"
decoder which is capable of successfully decoding that layer of the bitstream. In the case
where the profile_and_level_indication Escape bit = 0, this will be the numerically largest
of the possible valid values of profile_and_level_indication.
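
The subset rule of 8 (a numerically larger identification value is a subset of a numerically smaller one) together with the Simple profile @ Main level exception appears to reproduce the profile @ level rows of Table 8-15. The sketch below illustrates this reading; the function name can_decode and the "profile@level" string format are assumptions made for this illustration, and ISO/IEC 11172 bitstreams are not modelled.

```python
# Illustrative sketch: forward compatibility derived from the identification
# codes of Tables 8-2 and 8-3 (escape bit equal to zero).
PROFILE_ID = {"SP": 0b101, "MP": 0b100, "SNR": 0b011, "Spatial": 0b010, "HP": 0b001}
LEVEL_ID = {"LL": 0b1010, "ML": 0b1000, "H-14": 0b0110, "HL": 0b0100}


def can_decode(decoder, bitstream):
    """Both arguments use the 'profile@level' form, e.g. 'MP@ML'."""
    if decoder == "SP@ML" and bitstream == "MP@LL":
        return True  # explicit exception stated in 8.2
    dec_profile, dec_level = decoder.split("@")
    bit_profile, bit_level = bitstream.split("@")
    return (PROFILE_ID[bit_profile] >= PROFILE_ID[dec_profile]
            and LEVEL_ID[bit_level] >= LEVEL_ID[dec_level])
```

For example, can_decode("Spatial@H-14", "SNR@ML") is true while can_decode("HP@ML", "Spatial@H-14") is false, matching the corresponding cells of Table 8-15.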
Annex A
Discrete cosine transform

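(This annex forms an integral part of this Recommendation | International Standard)

The N x N two-dimensional DCT is defined as:

\[ F(u,v) = \frac{2}{N} C(u) C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N} \]

with u, v, x, y = 0, 1, 2, ... N-1
where x, y are spatial coordinates in the sample domain
      u, v are coordinates in the transform domain

\[ C(u), C(v) = \begin{cases} \frac{1}{\sqrt{2}} & \text{for } u, v = 0 \\ 1 & \text{otherwise} \end{cases} \]

The inverse DCT (IDCT) is defined as:

\[ f(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N} \]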

The input to the forward transform and output from the inverse transform is represented with 9 bits. The
coefficients are represented in 12 bits. The dynamic range of the DCT coefficients is [-2048:+2047].

The N by N inverse discrete transform shall conform to IEEE Standard Specification for the
Implementations of 8 by 8 Inverse Discrete Cosine Transform, Std 1180-1990, December 6, 1990.
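
The DCT and IDCT definitions can be illustrated with a direct floating-point evaluation of the double sums. This is only a slow reference sketch under the definitions of this annex; the function names dct_2d and idct_2d are assumptions made for this illustration, and the sketch makes no claim of meeting the accuracy requirements of Std 1180-1990.

```python
import math


def _c(k):
    # Normalisation factor C(u) or C(v): 1/sqrt(2) for index 0, otherwise 1.
    return 1 / math.sqrt(2) if k == 0 else 1.0


def _cos_table(n):
    # table[u][x] = cos((2x + 1) * u * pi / (2N))
    return [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
            for u in range(n)]


def dct_2d(block):
    """Forward DCT. block[x][y] holds f(x, y); the result holds F(u, v) as [u][v]."""
    n = len(block)
    cos = _cos_table(n)
    return [[(2 / n) * _c(u) * _c(v) * sum(block[x][y] * cos[u][x] * cos[v][y]
                                           for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]


def idct_2d(coeffs):
    """Inverse DCT. coeffs[u][v] holds F(u, v); the result holds f(x, y) as [x][y]."""
    n = len(coeffs)
    cos = _cos_table(n)
    return [[(2 / n) * sum(_c(u) * _c(v) * coeffs[u][v] * cos[u][x] * cos[v][y]
                           for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]
```

With N = 8, a block whose samples all equal 100 transforms to F(0,0) = (2/8) x (1/2) x 64 x 100 = 800 with all other coefficients equal to zero up to rounding error, and idct_2d recovers the original block.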

NOTES - 

1 Clause 2.3 of Std 1180-1990 "Considerations of Specifying IDCT Mismatch Errors" requires
the specification of periodic intra-picture coding in order to control the accumulation of
mismatch errors. Every macroblock is required to be refreshed before it is coded 132 times as
predictive macroblocks. Macroblocks in B-pictures (and skipped macroblocks in P-pictures)
are excluded from the counting because they do not lead to the accumulation of mismatch
errors. This requirement is the same as indicated in 1180-1990 for visual telephony
according to ITU-T Recommendation H.261.

2 Whilst the IEEE IDCT standard mentioned above is a necessary condition for the
satisfactory implementation of the IDCT function it should be understood that this is not
sufficient. In particular attention is drawn to the following sentence from 5.4 of this
specification: "Where arithmetic precision is not specified, such as the calculation
of the IDCT, the precision shall be sufficient so that significant errors do not occur in the
final integer values."
Annex B
Variable length code tables

(This annex forms an integral part of this Recommendation | International Standard)

B.1 Macroblock addressing



Table B-1 - Variable length codes for macroblock_address_increment

macroblock_address_increment VLC code   increment value
1                                       1
011                                     2
010                                     3
0011                                    4
0010                                    5
0001 1                                  6
0001 0                                  7
0000 111                                8
0000 110                                9
0000 1011                               10
0000 1010                               11
0000 1001                               12
0000 1000                               13
0000 0111                               14
0000 0110                               15
0000 0101 11                            16
0000 0101 10                            17
0000 0101 01                            18
0000 0101 00                            19
0000 0100 11                            20
0000 0100 10                            21
0000 0100 011                           22
0000 0100 010                           23
0000 0100 001                           24
0000 0100 000                           25
0000 0011 111                           26
0000 0011 110                           27
0000 0011 101                           28
0000 0011 100                           29
0000 0011 011                           30
0000 0011 010                           31
0000 0011 001                           32
0000 0011 000                           33
0000 0001 000                           macroblock_escape



NOTE - The "macroblock stuffing" entry that is available in ISO/IEC 11172-2 is not available in this
specification.
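
A decoder for the codes of Table B-1 can be sketched as a table lookup over a string of '0' and '1' characters. The function name decode_increment and the string-based bit representation are assumptions made for this illustration; the sketch also assumes that each macroblock_escape adds 33 to the increment that follows it.

```python
# Illustrative sketch: decode macroblock_address_increment from a bit string.
# CODES[i] is the code for increment value i + 1 (Table B-1, spaces removed).
CODES = [
    "1", "011", "010", "0011", "0010", "00011", "00010",
    "0000111", "0000110",
    "00001011", "00001010", "00001001", "00001000", "00000111", "00000110",
    "0000010111", "0000010110", "0000010101", "0000010100",
    "0000010011", "0000010010",
    "00000100011", "00000100010", "00000100001", "00000100000",
    "00000011111", "00000011110", "00000011101", "00000011100",
    "00000011011", "00000011010", "00000011001", "00000011000",
]
VLC = {code: value for value, code in enumerate(CODES, start=1)}
ESCAPE = "00000001000"  # macroblock_escape


def decode_increment(bits, pos=0):
    """Return (increment, position of the next unread bit)."""
    total = 0
    while bits.startswith(ESCAPE, pos):
        total += 33          # assumption: each escape contributes 33
        pos += len(ESCAPE)
    for length in range(1, 12):
        value = VLC.get(bits[pos:pos + length])
        if value is not None:
            return total + value, pos + length
    raise ValueError("no valid macroblock_address_increment code")
```

Because the code set is prefix-free, trying successively longer candidates finds at most one match.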
B.2 Macroblock type

The properties of the macroblock are determined by the macroblock type VLC according to these tables.



Table B-2 - Variable length codes for macroblock_type in I-pictures

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
1            0    0    0    0    1    0    Intra                            0
01           1    0    0    0    1    0    Intra, Quant                     0



Table B-3 - Variable length codes for macroblock_type in P-pictures

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
1            0    1    0    1    0    0    MC, Coded                        0
01           0    0    0    1    0    0    No MC, Coded                     0
001          0    1    0    0    0    0    MC, Not Coded                    0
0001 1       0    0    0    0    1    0    Intra                            0
0001 0       1    1    0    1    0    0    MC, Coded, Quant                 0
0000 1       1    0    0    1    0    0    No MC, Coded, Quant              0
0000 01      1    0    0    0    1    0    Intra, Quant                     0
Table B-4 - Variable length codes for macroblock_type in B-pictures

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
10           0    1    1    0    0    0    Interp, Not Coded                0
11           0    1    1    1    0    0    Interp, Coded                    0
010          0    0    1    0    0    0    Bwd, Not Coded                   0
011          0    0    1    1    0    0    Bwd, Coded                       0
0010         0    1    0    0    0    0    Fwd, Not Coded                   0
0011         0    1    0    1    0    0    Fwd, Coded                       0
0001 1       0    0    0    0    1    0    Intra                            0
0001 0       1    1    1    1    0    0    Interp, Coded, Quant             0
0000 11      1    1    0    1    0    0    Fwd, Coded, Quant                0
0000 10      1    0    1    1    0    0    Bwd, Coded, Quant                0
0000 01      1    0    0    0    1    0    Intra, Quant                     0



Table B-5 - Variable length codes for macroblock_type in I-pictures with spatial scalability.

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
1            0    0    0    1    0    0    Coded, Compatible                4
01           1    0    0    1    0    0    Coded, Compatible, Quant         4
0011         0    0    0    0    1    0    Intra                            0
0010         1    0    0    0    1    0    Intra, Quant                     0
0001         0    0    0    0    0    0    Not Coded, Compatible            4
Table B-6 - Variable length codes for macroblock_type in P-pictures with spatial scalability.

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
10           0    1    0    1    0    0    MC, Coded                        0
011          0    1    0    1    0    1    MC, Coded, Compatible            1,2,3
0000 100     0    0    0    1    0    0    No MC, Coded                     0
0001 11      0    0    0    1    0    1    No MC, Coded, Compatible         1,2,3
0010         0    1    0    0    0    0    MC, Not Coded                    0
0000 111     0    0    0    0    1    0    Intra                            0
0011         0    1    0    0    0    1    MC, Not Coded, Compatible        1,2,3
010          1    1    0    1    0    0    MC, Coded, Quant                 0
0001 00      1    0    0    1    0    0    No MC, Coded, Quant              0
0000 110     1    0    0    0    1    0    Intra, Quant                     0
11           1    1    0    1    0    1    MC, Coded, Compatible, Quant     1,2,3
0001 01      1    0    0    1    0    1    No MC, Coded, Compatible, Quant  1,2,3
0001 10      0    0    0    0    0    1    No MC, Not Coded, Compatible     1,2,3
0000 101     0    0    0    1    0    0    Coded, Compatible                4
0000 010     1    0    0    1    0    0    Coded, Compatible, Quant         4
0000 011     0    0    0    0    0    0    Not Coded, Compatible            4
Table B-7 - Variable length codes for macroblock_type in B-pictures with spatial scalability.

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
10           0    1    1    0    0    0    Interp, Not coded                0
11           0    1    1    1    0    0    Interp, Coded                    0
010          0    0    1    0    0    0    Back, Not coded                  0
011          0    0    1    1    0    0    Back, Coded                      0
0010         0    1    0    0    0    0    For, Not coded                   0
0011         0    1    0    1    0    0    For, Coded                       0
0001 10      0    0    1    0    0    1    Back, Not Coded, Compatible      1,2,3
0001 11      0    0    1    1    0    1    Back, Coded, Compatible          1,2,3
0001 00      0    1    0    0    0    1    For, Not Coded, Compatible       1,2,3
0001 01      0    1    0    1    0    1    For, Coded, Compatible           1,2,3
0000 110     0    0    0    0    1    0    Intra                            0
0000 111     1    1    1    1    0    0    Interp, Coded, Quant             0
0000 100     1    1    0    1    0    0    For, Coded, Quant                0
0000 101     1    0    1    1    0    0    Back, Coded, Quant               0
0000 0100    1    0    0    0    1    0    Intra, Quant                     0
0000 0101    1    1    0    1    0    1    For, Coded, Compatible, Quant    1,2,3
0000 0110 0  1    0    1    1    0    1    Back, Coded, Compatible, Quant   1,2,3
0000 0111 0  0    0    0    0    0    0    Not Coded, Compatible            4
0000 0110 1  1    0    0    1    0    0    Coded, Compatible, Quant         4
0000 0111 1  0    0    0    1    0    0    Coded, Compatible                4
Table B-8 - Variable length codes for macroblock_type in I-pictures, P-pictures and B-pictures
with SNR scalability.

Columns: (1) macroblock_type VLC code, (2) macroblock_quant, (3) macroblock_motion_forward,
(4) macroblock_motion_backward, (5) macroblock_pattern, (6) macroblock_intra,
(7) spatial_temporal_weight_code_flag, (8) Description, (9) permitted spatial_temporal_weight_classes

(1)          (2)  (3)  (4)  (5)  (6)  (7)  (8)                              (9)
1            0    0    0    1    0    0    Coded                            0
01           1    0    0    1    0    0    Coded, Quant                     0
001          0    0    0    0    0    0    Not Coded                        0

NOTE - There is no differentiation between picture types, since macroblocks are processed identically
in I, P and B-pictures. The "Not coded" type is needed, since skipped macroblocks are not
allowed at beginning and end of a slice.
B.3 Macroblock pattern

Table B-9 - Variable length codes for coded_block_pattern.

coded_block_pattern VLC code   cbp
111                            60
1101                           4
1100                           8
1011                           16
1010                           32
1001 1                         12
1001 0                         48
1000 1                         20
1000 0                         40
0111 1                         28
0111 0                         44
0110 1                         52
0110 0                         56
0101 1                         1
0101 0                         61
0100 1                         2
0100 0                         62
0011 11                        24
0011 10                        36
0011 01                        3
0011 00                        63
0010 111                       5
0010 110                       9
0010 101                       17
0010 100                       33
0010 011                       6
0010 010                       10
0010 001                       18
0010 000                       34
0001 1111                      7
0001 1110                      11
0001 1101                      19
0001 1100                      35
0001 1011                      13
0001 1010                      49
0001 1001                      21
0001 1000                      41
0001 0111                      14
0001 0110                      50
0001 0101                      22
0001 0100                      42
0001 0011                      15
0001 0010                      51
0001 0001                      23
0001 0000                      43
0000 1111                      25
0000 1110                      37
0000 1101                      26
0000 1100                      38
0000 1011                      29
0000 1010                      45
0000 1001                      53
0000 1000                      57
0000 0111                      30
0000 0110                      46
0000 0101                      54
0000 0100                      58
0000 0011 1                    31
0000 0011 0                    47
0000 0010 1                    55
0000 0010 0                    59
0000 0001 1                    27
0000 0001 0                    39
0000 0000 1                    0 (NOTE)

NOTE - This entry shall not be used with 4:2:0 chrominance structure.
B.4 Motion vectors

Table B-10 — Variable length codes for motion_code

Variable length code   motion_code[r][s][t]
0000 0011 001          -16
0000 0011 011          -15
0000 0011 101          -14
0000 0011 111          -13
0000 0100 001          -12
0000 0100 011          -11
0000 0100 11           -10
0000 0101 01           -9
0000 0101 11           -8
0000 0111              -7
0000 1001              -6
0000 1011              -5
0000 111               -4
0001 1                 -3
0011                   -2
011                    -1
1                      0
010                    1
0010                   2
0001 0                 3
0000 110               4
0000 1010              5
0000 1000              6
0000 0110              7
0000 0101 10           8
0000 0101 00           9
0000 0100 10           10
0000 0100 010          11
0000 0100 000          12
0000 0011 110          13
0000 0011 100          14
0000 0011 010          15
0000 0011 000          16
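
As an illustration of how Table B-10 can be read, the following Python sketch decodes one motion_code from a bit string. It is not part of this specification. The bitstream is modelled as a string of '0' and '1' characters, and the names MAGNITUDE_PREFIX, MOTION_CODE_VLC and decode_motion_code are hypothetical. The sketch relies on a regularity visible in the table: every non-zero value is a magnitude prefix followed by one sign bit ('0' for positive, '1' for negative).

```python
# Magnitude prefixes read off Table B-10 (spaces removed); the final bit of
# each table entry is the sign bit and is appended below.
MAGNITUDE_PREFIX = {
    1: "01", 2: "001", 3: "0001", 4: "000011",
    5: "0000101", 6: "0000100", 7: "0000011",
    8: "000001011", 9: "000001010", 10: "000001001",
    11: "0000010001", 12: "0000010000",
    13: "0000001111", 14: "0000001110",
    15: "0000001101", 16: "0000001100",
}

# Full code -> motion_code mapping (33 entries, as in the table).
MOTION_CODE_VLC = {"1": 0}
for magnitude, prefix in MAGNITUDE_PREFIX.items():
    MOTION_CODE_VLC[prefix + "0"] = magnitude
    MOTION_CODE_VLC[prefix + "1"] = -magnitude


def decode_motion_code(bits, pos=0):
    """Return (motion_code, next_position) for the code starting at pos."""
    for length in range(1, 12):  # the longest code in the table has 11 bits
        code = bits[pos:pos + length]
        if len(code) == length and code in MOTION_CODE_VLC:
            return MOTION_CODE_VLC[code], pos + length
    raise ValueError("no motion_code found at bit position %d" % pos)
```

Because no code in the table is a prefix of another, the first match found while lengthening the candidate is the only possible one.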
Table B-11 — Variable length codes for dmvector[t]

code   value
11     -1
0      0
10     1



B.5 DCT coefficients 



Table B-12 — Variable length codes for dct_dc_size_luminance

Variable length code   dct_dc_size_luminance
100                    0
00                     1
01                     2
101                    3
110                    4
1110                   5
1111 0                 6
1111 10                7
1111 110               8
1111 1110              9
1111 1111 0            10
1111 1111 1            11


Table B-13 — Variable length codes for dct_dc_size_chrominance

Variable length code   dct_dc_size_chrominance
00                     0
01                     1
10                     2
110                    3
1110                   4
1111 0                 5
1111 10                6
1111 110               7
1111 1110              8
1111 1111 0            9
1111 1111 10           10
1111 1111 11           11
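
The two DC size tables are small enough to check mechanically. The Python sketch below is an illustration only, not part of this specification. It transcribes Tables B-12 and B-13 (spaces removed) and verifies two properties a decoder depends on: no code is a prefix of another, and the code lengths satisfy the Kraft equality, so every bit pattern decodes to exactly one size. All names (DC_SIZE_LUMINANCE, is_prefix_free, kraft_sum, decode_dc_size) are hypothetical.

```python
from fractions import Fraction

DC_SIZE_LUMINANCE = {
    "100": 0, "00": 1, "01": 2, "101": 3, "110": 4, "1110": 5,
    "11110": 6, "111110": 7, "1111110": 8, "11111110": 9,
    "111111110": 10, "111111111": 11,
}
DC_SIZE_CHROMINANCE = {
    "00": 0, "01": 1, "10": 2, "110": 3, "1110": 4, "11110": 5,
    "111110": 6, "1111110": 7, "11111110": 8, "111111110": 9,
    "1111111110": 10, "1111111111": 11,
}


def is_prefix_free(codes):
    """True if no code is a prefix of another (adjacent check after sorting)."""
    ordered = sorted(codes)
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def kraft_sum(codes):
    """Sum of 2**-length over all codes; equals 1 for a complete prefix code."""
    return sum(Fraction(1, 2 ** len(code)) for code in codes)


def decode_dc_size(bits, table, pos=0):
    """Return (dct_dc_size, next_position) for the code starting at pos."""
    longest = max(len(code) for code in table)
    for length in range(1, longest + 1):
        code = bits[pos:pos + length]
        if len(code) == length and code in table:
            return table[code], pos + length
    raise ValueError("no dct_dc_size code found")
```

Exact rational arithmetic is used for the Kraft sum so that the comparison with 1 is not affected by rounding.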
Table B-14 — DCT coefficients Table zero

Variable length code (NOTE 1)   run            level
10 (NOTE 2)                     End of Block
1 s (NOTE 3)                    0              1
11 s (NOTE 4)                   0              1
011 s                           1              1
0100 s                          0              2
0101 s                          2              1
0010 1 s                        0              3
0011 1 s                        3              1
0011 0 s                        4              1
0001 10 s                       1              2
0001 11 s                       5              1
0001 01 s                       6              1
0001 00 s                       7              1
0000 110 s                      0              4
0000 100 s                      2              2
0000 111 s                      8              1
0000 101 s                      9              1
0000 01                         Escape
0010 0110 s                     0              5
0010 0001 s                     0              6
0010 0101 s                     1              3
0010 0100 s                     3              2
0010 0111 s                     10             1
0010 0011 s                     11             1
0010 0010 s                     12             1
0010 0000 s                     13             1
0000 0010 10 s                  0              7
0000 0011 00 s                  1              4
0000 0010 11 s                  2              3
0000 0011 11 s                  4              2
0000 0010 01 s                  5              2
0000 0011 10 s                  14             1
0000 0011 01 s                  15             1
0000 0010 00 s                  16             1

NOTE 1 - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
NOTE 2 - "End of Block" shall not be the only code of the block.
NOTE 3 - This code shall be used for the first (DC) coefficient in the block.
NOTE 4 - This code shall be used for all other coefficients.
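
The notes to Table B-14 imply a small amount of state in the decoder: the code for (run 0, level 1) is '1 s' for the first coefficient of a block and '11 s' afterwards, and '10' can therefore only mean End of Block after the first coefficient. The Python sketch below illustrates this under stated assumptions. It is not part of this specification, it uses only a handful of entries from the table (TABLE_ZERO_SUBSET is a hypothetical, deliberately incomplete mapping), it does not handle Escape, and it models the bitstream as a string of '0' and '1' characters.

```python
# Subset of Table B-14: code without the trailing sign bit -> (run, level).
TABLE_ZERO_SUBSET = {
    "011": (1, 1),
    "0100": (0, 2),
    "0101": (2, 1),
    "00101": (0, 3),
    "00111": (3, 1),
    "00110": (4, 1),
}


def decode_block(bits):
    """Decode one block; return a list of (run, signed_level) pairs."""
    coefficients = []
    pos = 0
    first = True
    while True:
        if bits[pos] == "1":
            if first:
                pos += 1                 # '1 s', first coefficient (NOTE 3)
            elif bits[pos + 1] == "0":
                return coefficients      # '10', End of Block
            else:
                pos += 2                 # '11 s', other coefficients (NOTE 4)
            run, level = 0, 1
        else:
            for length in (3, 4, 5):
                entry = TABLE_ZERO_SUBSET.get(bits[pos:pos + length])
                if entry is not None:
                    run, level = entry
                    pos += length
                    break
            else:
                raise ValueError("code not in the subset table")
        sign = bits[pos]                 # NOTE 1: '0' positive, '1' negative
        pos += 1
        coefficients.append((run, -level if sign == "1" else level))
        first = False
```

For example, the bit string 1 0, 0100 1, 011 0, 10 decodes to the pairs (0, +1), (0, -2), (1, +1) followed by End of Block.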
Table B-14 — DCT coefficients Table zero (continued)

Variable length code (NOTE)   run   level
0000 0001 1101 s              0     8
0000 0001 1000 s              0     9
0000 0001 0011 s              0     10
0000 0001 0000 s              0     11
0000 0001 1011 s              1     5
0000 0001 0100 s              2     4
0000 0001 1100 s              3     3
0000 0001 0010 s              4     3
0000 0001 1110 s              6     2
0000 0001 0101 s              7     2
0000 0001 0001 s              8     2
0000 0001 1111 s              17    1
0000 0001 1010 s              18    1
0000 0001 1001 s              19    1
0000 0001 0111 s              20    1
0000 0001 0110 s              21    1
0000 0000 1101 0 s            0     12
0000 0000 1100 1 s            0     13
0000 0000 1100 0 s            0     14
0000 0000 1011 1 s            0     15
0000 0000 1011 0 s            1     6
0000 0000 1010 1 s            1     7
0000 0000 1010 0 s            2     5
0000 0000 1001 1 s            3     4
0000 0000 1001 0 s            5     3
0000 0000 1000 1 s            9     2
0000 0000 1000 0 s            10    2
0000 0000 1111 1 s            22    1
0000 0000 1111 0 s            23    1
0000 0000 1110 1 s            24    1
0000 0000 1110 0 s            25    1
0000 0000 1101 1 s            26    1

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
Table B-14 — DCT coefficients Table zero (continued)

Variable length code (NOTE)   run   level
0000 0000 0111 11 s           0     16
0000 0000 0111 10 s           0     17
0000 0000 0111 01 s           0     18
0000 0000 0111 00 s           0     19
0000 0000 0110 11 s           0     20
0000 0000 0110 10 s           0     21
0000 0000 0110 01 s           0     22
0000 0000 0110 00 s           0     23
0000 0000 0101 11 s           0     24
0000 0000 0101 10 s           0     25
0000 0000 0101 01 s           0     26
0000 0000 0101 00 s           0     27
0000 0000 0100 11 s           0     28
0000 0000 0100 10 s           0     29
0000 0000 0100 01 s           0     30
0000 0000 0100 00 s           0     31
0000 0000 0011 000 s          0     32
0000 0000 0010 111 s          0     33
0000 0000 0010 110 s          0     34
0000 0000 0010 101 s          0     35
0000 0000 0010 100 s          0     36
0000 0000 0010 011 s          0     37
0000 0000 0010 010 s          0     38
0000 0000 0010 001 s          0     39
0000 0000 0010 000 s          0     40
0000 0000 0011 111 s          1     8
0000 0000 0011 110 s          1     9
0000 0000 0011 101 s          1     10
0000 0000 0011 100 s          1     11
0000 0000 0011 011 s          1     12
0000 0000 0011 010 s          1     13
0000 0000 0011 001 s          1     14

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
Table B-14 — DCT coefficients Table zero (concluded)

Variable length code (NOTE)   run   level
0000 0000 0001 0011 s         1     15
0000 0000 0001 0010 s         1     16
0000 0000 0001 0001 s         1     17
0000 0000 0001 0000 s         1     18
0000 0000 0001 0100 s         6     3
0000 0000 0001 1010 s         11    2
0000 0000 0001 1001 s         12    2
0000 0000 0001 1000 s         13    2
0000 0000 0001 0111 s         14    2
0000 0000 0001 0110 s         15    2
0000 0000 0001 0101 s         16    2
0000 0000 0001 1111 s         27    1
0000 0000 0001 1110 s         28    1
0000 0000 0001 1101 s         29    1
0000 0000 0001 1100 s         30    1
0000 0000 0001 1011 s         31    1

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
Table B-15 — DCT coefficients Table one

Variable length code (NOTE 1)   run            level
0110 (NOTE 2)                   End of Block
10 s                            0              1
010 s                           1              1
110 s                           0              2
0010 1 s                        2              1
0111 s                          0              3
0011 1 s                        3              1
0001 10 s                       4              1
0011 0 s                        1              2
0001 11 s                       5              1
0000 110 s                      6              1
0000 100 s                      7              1
1110 0 s                        0              4
0000 111 s                      2              2
0000 101 s                      8              1
1111 000 s                      9              1
0000 01                         Escape
1110 1 s                        0              5
0001 01 s                       0              6
1111 001 s                      1              3
0010 0110 s                     3              2
1111 010 s                      10             1
0010 0001 s                     11             1
0010 0101 s                     12             1
0010 0100 s                     13             1
0001 00 s                       0              7
0010 0111 s                     1              4
1111 1100 s                     2              3
1111 1101 s                     4              2
0000 0010 0 s                   5              2
0000 0010 1 s                   14             1
0000 0011 1 s                   15             1
0000 0011 01 s                  16             1

NOTE 1 - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
NOTE 2 - "End of Block" shall not occur as the only code of a block.
Table B-15 — DCT coefficients Table one (continued)

Variable length code (NOTE)   run   level
1111 011 s                    0     8
1111 100 s                    0     9
0010 0011 s                   0     10
0010 0010 s                   0     11
0010 0000 s                   1     5
0000 0011 00 s                2     4
0000 0001 1100 s              3     3
0000 0001 0010 s              4     3
0000 0001 1110 s              6     2
0000 0001 0101 s              7     2
0000 0001 0001 s              8     2
0000 0001 1111 s              17    1
0000 0001 1010 s              18    1
0000 0001 1001 s              19    1
0000 0001 0111 s              20    1
0000 0001 0110 s              21    1
1111 1010 s                   0     12
1111 1011 s                   0     13
1111 1110 s                   0     14
1111 1111 s                   0     15
0000 0000 1011 0 s            1     6
0000 0000 1010 1 s            1     7
0000 0000 1010 0 s            2     5
0000 0000 1001 1 s            3     4
0000 0000 1001 0 s            5     3
0000 0000 1000 1 s            9     2
0000 0000 1000 0 s            10    2
0000 0000 1111 1 s            22    1
0000 0000 1111 0 s            23    1
0000 0000 1110 1 s            24    1
0000 0000 1110 0 s            25    1
0000 0000 1101 1 s            26    1

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
Table B-15 — DCT coefficients Table one (continued)

Variable length code (NOTE)   run   level
0000 0000 0111 11 s           0     16
0000 0000 0111 10 s           0     17
0000 0000 0111 01 s           0     18
0000 0000 0111 00 s           0     19
0000 0000 0110 11 s           0     20
0000 0000 0110 10 s           0     21
0000 0000 0110 01 s           0     22
0000 0000 0110 00 s           0     23
0000 0000 0101 11 s           0     24
0000 0000 0101 10 s           0     25
0000 0000 0101 01 s           0     26
0000 0000 0101 00 s           0     27
0000 0000 0100 11 s           0     28
0000 0000 0100 10 s           0     29
0000 0000 0100 01 s           0     30
0000 0000 0100 00 s           0     31
0000 0000 0011 000 s          0     32
0000 0000 0010 111 s          0     33
0000 0000 0010 110 s          0     34
0000 0000 0010 101 s          0     35
0000 0000 0010 100 s          0     36
0000 0000 0010 011 s          0     37
0000 0000 0010 010 s          0     38
0000 0000 0010 001 s          0     39
0000 0000 0010 000 s          0     40
0000 0000 0011 111 s          1     8
0000 0000 0011 110 s          1     9
0000 0000 0011 101 s          1     10
0000 0000 0011 100 s          1     11
0000 0000 0011 011 s          1     12
0000 0000 0011 010 s          1     13
0000 0000 0011 001 s          1     14

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.
Table B-15 — DCT coefficients Table one (concluded)

Variable length code (NOTE)   run   level
0000 0000 0001 0011 s         1     15
0000 0000 0001 0010 s         1     16
0000 0000 0001 0001 s         1     17
0000 0000 0001 0000 s         1     18
0000 0000 0001 0100 s         6     3
0000 0000 0001 1010 s         11    2
0000 0000 0001 1001 s         12    2
0000 0000 0001 1000 s         13    2
0000 0000 0001 0111 s         14    2
0000 0000 0001 0110 s         15    2
0000 0000 0001 0101 s         16    2
0000 0000 0001 1111 s         27    1
0000 0000 0001 1110 s         28    1
0000 0000 0001 1101 s         29    1
0000 0000 0001 1100 s         30    1
0000 0000 0001 1011 s         31    1

NOTE - The last bit 's' denotes the sign of the level, '0' for positive, '1' for negative.



Table B-16 — Encoding of run and level following an ESCAPE code

fixed length code   run
0000 00             0
0000 01             1
0000 10             2
...                 ...
1111 11             63

fixed length code   signed_level
1000 0000 0001      -2047
1000 0000 0010      -2046
...                 ...
1111 1111 1111      -1
0000 0000 0000      forbidden
0000 0000 0001      +1
...                 ...
0111 1111 1111      +2047
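
Table B-16 describes an 18-bit fixed-length field: a 6-bit unsigned run followed by a 12-bit level in two's complement form, with the all-zero level forbidden. The Python sketch below is a minimal illustration, not part of this specification. The function name decode_escape is hypothetical and the input is modelled as a string of '0' and '1' characters. The table lists no entry for 1000 0000 0000, so the sketch assumes that pattern must be rejected as well.

```python
def decode_escape(bits):
    """Decode the 18 bits that follow an Escape code into (run, signed_level)."""
    if len(bits) != 18 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 18 binary digits")
    run = int(bits[:6], 2)               # 6-bit unsigned run, 0..63
    raw = int(bits[6:], 2)               # 12-bit field, 0..4095
    if raw == 0:
        raise ValueError("signed_level 0000 0000 0000 is forbidden")
    # Two's complement: values with the top bit set are negative.
    level = raw - 4096 if raw >= 2048 else raw
    if level == -2048:
        # Assumption: not listed in Table B-16, so treated as invalid here.
        raise ValueError("signed_level 1000 0000 0000 is not listed")
    return run, level
```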
Annex C
Video buffering verifier

(This annex forms an integral part of this Recommendation | International Standard)



Coded video bitstreams shall meet constraints imposed through a Video Buffering Verifier (VBV) defined
in this clause. Each bitstream in a scalable hierarchy shall not violate the VBV constraints defined in this
annex.

The VBV is a hypothetical decoder, which is conceptually connected to the output of an encoder. It has
an input buffer known as the VBV buffer. Coded data is placed in the buffer as defined below in C.3 and
is removed from the buffer as defined in C.5, C.6, and C.7. It is required that a bitstream that conforms to
this specification shall not cause the VBV buffer to overflow. When low_delay equals zero, the bitstream
shall not cause the VBV buffer to underflow. When low_delay equals one, decoding a picture at the
normally expected time might cause the VBV buffer to underflow. If this is the case the picture is not
decoded and the VBV buffer is re-examined at a sequence of later times specified in C.7 and C.8 until it is
all present in the VBV buffer.

All the arithmetic in Annex C is done with real values, so that no rounding errors can propagate. For
example, the number of bits in the VBV buffer is not necessarily an integer.

C.1 The VBV and the video encoder have the same clock frequency as well as the same frame rate,
and are operated synchronously.

C.2 The VBV buffer is of size B, where B is the vbv_buffer_size coded in the sequence header and
sequence extension if present.

C.3 This clause defines the input of data to the VBV buffer. Two mutually exclusive cases are
defined in C.3.1 and C.3.2. In both cases the VBV buffer is initially empty. Let Rmax be the
bitrate specified in the bit_rate field.
C.3.1 In the case where vbv_delay is coded with a value not equal to hexadecimal FFFF, the picture
data of the n'th coded picture enters the buffer at a rate R(n) where:

R(n) = d*n / ( τ(n) - τ(n+1) + t(n+1) - t(n) )

Where:

R(n) is the rate, in bits/s, that the picture data for the n'th coded picture enters the
VBV.

d*n is the number of bits after the final bit of the n'th picture start code and before
and including the final bit of the (n+1)'th picture start code.

τ(n) is the decoding delay coded in vbv_delay for the n'th coded picture, measured
in seconds.

t(n) is the time, measured in seconds, when the n'th coded picture is removed from the
VBV buffer. t(n) is defined in clauses C.9, C.10, C.11 and C.12.

For the bits preceding the first picture start code and following the final picture start code,
R(n) = Rmax.

After filling the VBV buffer with all the data that precedes the first picture start code of the
sequence and the picture start code itself, the VBV buffer is filled from the bitstream for the time
specified by the vbv_delay field in the picture header. At this time decoding begins. The data
input continues at the rates specified in this sub-clause.

For all bitstreams R(n) <= Rmax for all picture data.

NOTE - For constant rate video the sequence of values R(n) is constant throughout the sequence
to within the accuracy permitted by the quantisation of vbv_delay.
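
As a worked illustration of the rate defined in C.3.1, the following Python sketch evaluates R(n) as the number of bits between two picture start codes divided by the interval (tau(n) - tau(n+1) + t(n+1) - t(n)), where tau is the delay coded in vbv_delay and t is the removal time. It is a sketch under stated assumptions, not part of this specification: the function name picture_input_rate and its argument names are hypothetical, and exact fractions are used to mirror the real-valued arithmetic required by this annex.

```python
from fractions import Fraction


def picture_input_rate(d_n, tau_n, tau_next, t_n, t_next):
    """Rate in bits/s at which the n'th picture's data enters the VBV.

    d_n:      bits between the n'th and (n+1)'th picture start codes
    tau_n:    delay coded in vbv_delay for picture n, in seconds
    tau_next: delay coded in vbv_delay for picture n+1, in seconds
    t_n:      removal time of picture n, in seconds
    t_next:   removal time of picture n+1, in seconds
    """
    interval = tau_n - tau_next + t_next - t_n
    if interval <= 0:
        raise ValueError("delivery interval must be positive")
    return d_n / interval


# Example: unchanged vbv_delay, pictures removed 1/25 s apart.
rate = picture_input_rate(160000, Fraction(1, 4), Fraction(1, 4),
                          Fraction(0), Fraction(1, 25))
```

With an unchanged vbv_delay the interval reduces to t(n+1) - t(n), so 160000 bits delivered over 1/25 s corresponds to 4 000 000 bits/s. A conforming stream would additionally need this value not to exceed Rmax.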

C.3.2 In the case where vbv_delay is coded with the value hexadecimal FFFF, data enters the VBV
buffer as specified in this sub-clause.

If the VBV buffer is not full, data enters the buffer at Rmax.

If the VBV buffer becomes full after filling at Rmax for some time, no more data enters
the buffer until some data is removed from the buffer.

After filling the VBV buffer with all the data that precedes the first picture start code of the
sequence and the picture start code itself, the VBV buffer is filled from the bitstream until it is
full. At this time decoding begins. The data input continues at the rate specified in this
sub-clause.

C.4 Starting at the time defined in C.3, the VBV buffer is examined at successive times defined in
C.9 to C.12. C.5 to C.8 define the actions to be taken at each time the VBV buffer is examined.

C.5 This clause defines a requirement on all video bitstreams.

At the time the VBV buffer is examined before removing any picture data, the number of bits in
the buffer shall lie between zero bits and B bits, where B is the size of the VBV buffer indicated
by vbv_buffer_size.

For the purpose of this annex, picture data is defined as all the bits of the coded picture, all the
header(s) and user data immediately preceding it if any (including any stuffing between them)
and all the stuffing following it, up to (but not including) the next start code, except in the case
where the next start code is an end of sequence code, in which case it is included in the picture
data.




Figure C-1. VBV Buffer Occupancy - Constant bit-rate operation 

C.6 This clause defines a requirement on the video bitstreams when the low_delay flag is equal to
zero.

At each time the VBV buffer is examined and before any bits are removed, all of the data for the
picture which (at that time) has been in the buffer longest shall be present in the VBV buffer.
This picture data shall be removed instantaneously at this time.

VBV buffer underflow shall not occur when the low_delay flag is equal to 0. This requires that
all picture data for the n'th picture shall be present in the VBV buffer at the decoding time, tn.
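
The requirements of C.5 and C.6 can be illustrated with a small buffer simulation. The Python sketch below is a simplified model, not part of this specification: it assumes constant bit-rate delivery, a fixed interval between examinations, and a stream that consists only of the listed picture data. The function name check_vbv_cbr and its parameters are hypothetical. Exact fractions are used because this annex requires real-valued arithmetic.

```python
from fractions import Fraction


def check_vbv_cbr(picture_bits, bit_rate, buffer_size, vbv_delay, interval):
    """Return the buffer occupancy seen at each examination time.

    Raises ValueError if the occupancy exceeds buffer_size (overflow, C.5)
    or if a picture is not completely present when examined (underflow, C.6).
    """
    rate = Fraction(bit_rate)
    undelivered = Fraction(sum(picture_bits))   # bits still to arrive
    delivered = min(rate * vbv_delay, undelivered)
    undelivered -= delivered
    occupancy = delivered                       # fill before decoding begins
    trace = []
    for n, size in enumerate(picture_bits):
        if occupancy > buffer_size:
            raise ValueError("VBV overflow before removing picture %d" % n)
        if occupancy < size:
            raise ValueError("VBV underflow at picture %d" % n)
        trace.append(occupancy)
        # Remove the picture instantaneously, then fill until the next
        # examination (or until the stream has been delivered completely).
        delivered = min(rate * interval, undelivered)
        undelivered -= delivered
        occupancy += delivered - size
    return trace
```

In this fluid model the occupancy only rises between examinations, so checking it immediately before each removal is sufficient to detect overflow.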

C.7 This clause only applies when the low_delay flag is equal to one.

When low_delay is equal to one, there may be situations where the VBV buffer shall be
re-examined several times before removing a coded picture from the VBV buffer. It is possible to
know if the VBV buffer has to be re-examined and how many times by looking at the
temporal_reference of the next picture (the one that follows the picture currently to be decoded),
see 6.3.10. If the VBV buffer has to be re-examined, the picture currently to be decoded is
referred to as a big picture.

If the picture currently to be decoded is a big picture, the VBV buffer is re-examined at intervals of 2
field-periods before removing the big picture, and no picture data is removed until the final
re-examination.

At this time, the number of bits in the VBV buffer immediately before removing the big picture
shall be less than B, all the picture data for the picture that has been in the buffer longest (the big
picture) shall be present in the buffer and shall be removed instantaneously. Then normal
operation of the VBV resumes, and C.5 applies.
The last coded picture of a sequence shall not be a big picture.

C.8 This clause is informative only.

The situation where the VBV buffer would underflow (see C.7) can happen when low-delay
applications occasionally transmit large pictures, for example in case of scene-cuts.

Decoding such bitstreams will cause the display process associated with a decoder to repeat a
previously decoded field or frame until normal operation of the VBV can resume. This process is
sometimes referred to as the occurrence of "skipped pictures". Note that this situation should
normally not occur except occasionally. It shall not occur when low_delay is equal to 0.

C.9 This clause defines the time intervals between successive examinations of the VBV buffer in the
case where progressive_sequence is equal to 1 and low_delay is equal to 0. In this case, the frame
reordering delay always exists and B-pictures can occur.

The time interval tn+1 - tn between two successive examinations of the VBV buffer is a multiple
of T, where T is the inverse of the frame rate.

If the n'th picture is a B-Picture with repeat_first_field equal to 0, then tn+1 - tn is equal to T.

If the n'th picture is a B-Picture with repeat_first_field equal to 1 and top_field_first equal to 0,
then tn+1 - tn is equal to 2*T.

If the n'th picture is a B-Picture with repeat_first_field equal to 1 and top_field_first equal to 1,
then tn+1 - tn is equal to 3*T.

If the n'th picture is a P-Picture or I-Picture and if the previous P-Picture or I-Picture has
repeat_first_field equal to 0, then tn+1 - tn is equal to T.

If the n'th picture is a P-Picture or I-Picture and if the previous P-Picture or I-Picture has
repeat_first_field equal to 1 and top_field_first equal to 0, then tn+1 - tn is equal to 2*T.

If the n'th picture is a P-Picture or I-Picture and if the previous P-Picture or I-Picture has
repeat_first_field equal to 1 and top_field_first equal to 1, then tn+1 - tn is equal to 3*T.



If tn+1 - tn cannot be determined with any of the previous paragraphs because the previous P- or I-Picture
does not exist (which can occur at the beginning of a sequence), then the time interval is arbitrary with the
following restrictions:

The time interval between removing one frame (or the first field of a frame) and removing the next frame
can be arbitrarily defined equal to T, 2*T or 3*T. In this case the delivery rate of the data for the first
frame is ambiguous. Therefore the VBV buffer status until after this data has been removed from the VBV
buffer may have more than one value. At least one of the valid choices for the decoding time shall lead to
a set of VBV buffer states that meet the requirements of this annex on overflow and underflow. If the
bitstream is multiplexed as part of a systems bitstream according to Recommendation
ITU-T H.220.0 | ISO/IEC 13818-1 then information in the systems bitstream may be used to determine
unambiguously the VBV buffer state after removing the first picture.
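
The rules of C.9 reduce to a lookup on two flags: a B-picture uses its own repeat_first_field and top_field_first, while a P- or I-picture uses the flags of the previous P- or I-picture. The Python sketch below illustrates this for progressive_sequence equal to 1 and low_delay equal to 0. It is not part of this specification. Pictures are modelled as dictionaries, and the names frame_periods and examination_interval are hypothetical.

```python
from fractions import Fraction


def frame_periods(repeat_first_field, top_field_first):
    """Number of frame periods T implied by the two display flags."""
    if not repeat_first_field:
        return 1
    return 3 if top_field_first else 2


def examination_interval(picture, previous_anchor, frame_rate):
    """Interval tn+1 - tn in seconds for the n'th picture.

    picture and previous_anchor are dicts with the keys 'type' ('I', 'P'
    or 'B'), 'repeat_first_field' and 'top_field_first'. previous_anchor
    is the previous P- or I-picture, or None if it does not exist.
    """
    period = Fraction(1) / frame_rate           # T, inverse of the frame rate
    source = picture if picture["type"] == "B" else previous_anchor
    if source is None:
        # Start of sequence: the interval is arbitrary (T, 2*T or 3*T).
        raise ValueError("interval is not determined by these rules")
    return frame_periods(source["repeat_first_field"],
                         source["top_field_first"]) * period
```

The start-of-sequence case is reported as an error rather than guessed, because the text leaves the choice among T, 2*T and 3*T open.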



C.10 This clause defines the time intervals between successive examinations of the VBV buffer in the
case where progressive_sequence is equal to 1 and low_delay is equal to 1. In this case the
sequence contains no B-Pictures and there is no frame reordering delay.

The time interval tn+1 - tn between two successive examinations of the VBV buffer is a multiple
of T, where T is the inverse of the frame rate.

If the n'th picture is a P-Picture or I-Picture with repeat_first_field equal to 0, then tn+1 - tn is
equal to T.

If the n'th picture is a P-Picture or I-Picture with repeat_first_field equal to 1 and top_field_first
equal to 0, then tn+1 - tn is equal to 2*T.

If the n'th picture is a P-Picture or I-Picture with repeat_first_field equal to 1 and top_field_first
equal to 1, then tn+1 - tn is equal to 3*T.

C.11 This clause defines the time intervals between successive examinations of the VBV buffer in the
case where progressive_sequence is equal to 0 and low_delay is equal to 0. In this case, the frame
reordering delay always exists and B-pictures can occur.

The time interval tn+1 - tn between two successive examinations of the VBV input buffer is a
multiple of T, where T is the inverse of two times the frame rate.

If the n'th picture is a frame-structure coded B-frame with repeat_first_field equal to 0, then
tn+1 - tn is equal to 2*T.

If the n'th picture is a frame-structure coded B-frame with repeat_first_field equal to 1, then
tn+1 - tn is equal to 3*T.

If the n'th picture is a field-structure B-picture (B-field picture), then tn+1 - tn is equal to T.

If the n'th picture is a frame-structure coded P-Frame or coded I-Frame and if the previous coded
P-Frame or coded I-Frame has repeat_first_field equal to 0, then tn+1 - tn is equal to 2*T.

If the n'th picture is a frame-structure coded P-Frame or coded I-Frame and if the previous coded
P-Frame or coded I-Frame has repeat_first_field equal to 1, then tn+1 - tn is equal to 3*T.

If the n'th picture is the first field of a field-structure coded P-Frame or coded I-Frame, then
tn+1 - tn is equal to T.

If the n'th picture is the second field of a field-structure coded P-Frame or coded I-Frame and if
the previous coded P-Frame or coded I-Frame is using field-structure or has repeat_first_field
equal to 0, then tn+1 - tn is equal to (2*T - T).

If the n'th picture is the second field of a field-structure coded P-Frame or coded I-Frame and if
the previous coded P-Frame or coded I-Frame is using frame-structure and has repeat_first_field
equal to 1, then tn+1 - tn is equal to (3*T - T).

If tn+1 - tn cannot be determined with any of the previous paragraphs because the previous coded P- or
I-frame does not exist (which can occur at the beginning of a sequence), then the time interval is arbitrary
with the following restrictions:

The time interval between removing one frame (or the first field of a frame) and removing the next frame
(or the first field of a frame) can be arbitrarily defined equal to 2*T or 3*T. Therefore the VBV buffer
status until after this data has been removed from the VBV buffer may have more than one value. At least
one of the valid choices for the decoding time shall lead to a set of VBV buffer states that meet the
requirements of this annex on overflow and underflow. If the bitstream is multiplexed as part of a systems
bitstream according to Recommendation ITU-T H.220.0 | ISO/IEC 13818-1 then information in the
systems bitstream may be used to determine unambiguously the VBV buffer state.
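
The case analysis of C.11 (progressive_sequence equal to 0, low_delay equal to 0) can be written as a single function that returns the interval as a number of field periods T, where T is the inverse of two times the frame rate. The Python sketch below is an illustration only, not part of this specification. Pictures are modelled as dictionaries with hypothetical keys, and the function name interval_in_field_periods is also hypothetical.

```python
def interval_in_field_periods(picture, previous_anchor=None):
    """Return (tn+1 - tn) / T for the n'th picture.

    picture keys: 'type' ('I', 'P' or 'B'), 'structure' ('frame' or
    'field'), 'repeat_first_field' (bool) and, for field-structure I- or
    P-pictures, 'second_field' (bool). previous_anchor describes the
    previous coded P- or I-frame with the keys 'structure' and
    'repeat_first_field', or is None if no such frame exists.
    """
    if picture["type"] == "B":
        if picture["structure"] == "field":
            return 1
        return 3 if picture["repeat_first_field"] else 2

    # I- or P-picture from here on.
    is_field = picture["structure"] == "field"
    if is_field and not picture.get("second_field", False):
        return 1                                  # first field
    if previous_anchor is None:
        # Start of sequence: the interval is arbitrary (2*T or 3*T).
        raise ValueError("interval is not determined by these rules")
    three_fields = (previous_anchor["structure"] == "frame"
                    and previous_anchor["repeat_first_field"])
    total = 3 if three_fields else 2
    # A second field has already consumed one field period: (total*T - T).
    return total - 1 if is_field else total
```

Multiplying the result by T gives the interval in seconds.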
Figure C-2

Figure C-2 shows the VBV in a simple case with only frame-pictures. Frames P0, B2 and B4 have a
display duration of 3 fields.

C.12 This clause defines the time intervals between successive examinations of the VBV buffer in the
case where progressive_sequence is equal to 0 and low_delay is equal to 1. In this case the
sequence contains no B-Pictures and there is no frame reordering delay.

The time interval tn+1 - tn between two successive examinations of the VBV input buffer is a
multiple of T, where T is the inverse of two times the frame rate.

If the n'th picture is a frame-structure coded P-Frame or coded I-Frame with repeat_first_field
equal to 0, then tn+1 - tn is equal to 2*T.

If the n'th picture is a frame-structure coded P-Frame or coded I-Frame with repeat_first_field
equal to 1, then tn+1 - tn is equal to 3*T.

If the n'th picture is a field-structure coded P-Frame or coded I-Frame, then tn+1 - tn is equal to
T.



Figure C-3 (axis label: Buffer fullness)

Figure C-3 shows the VBV in a simple case with only frame-pictures. Frames I0, P2 and P4 have
repeat_first_field equal to 1.
Annex D
Features supported by the algorithm

(This annex does not form an integral part of this Recommendation | International Standard)

D.1 Overview

The following non-exhaustive list of features is included in this specification: 

1) Different chrominance sampling formats (i.e., 4:2:0, 4:2:2 and 4:4:4) can be represented. 

2) Video in both the progressive and interlaced scan formats can be encoded. 

3) The decoder can use 3:2 pull down to represent a ~24 fps film as ~30 fps video.

4) The displayed video can be selected by a movable pan-scan window within a larger raster.

5) A wide range of picture qualities can be used.

6) Both constant and variable bitrate channels are supported.

7) A low delay mode for face-to-face applications is available.

8) Random access (for DSM, channel acquisition, and channel hopping) is available.

9) ISO/IEC 11172-2 constrained parameter bitstreams are decodable.

10) Bitstreams for high and low (hardware) complexity decoders can be generated.

11) Editing of encoded video is supported.

12) Fast-forward and fast-reverse playback of recorded bitstreams can be implemented.

13) The encoded bitstream is resilient to errors.

D.2 Video formats 

D.2.1 Sampling formats and colour 

This specification supports both interlaced and progressive video. The respective indication
is provided with a progressive_sequence flag transmitted in the Sequence Extension code.

Allowed raster sizes are between 1 and (2^14 - 1) luminance samples in each of the horizontal and vertical
directions. The video is represented in a luminance/chrominance colour space with selectable colour
primaries. The chrominance can be sampled in either the 4:2:0 format (half as many samples in the
horizontal and vertical directions) or the 4:2:2 format (half as many samples in the horizontal direction
only). Furthermore, application specific sample aspect ratios and image aspect ratios are flexibly
supported. A chroma_format parameter is contained in the Sequence Extension code.

Sample aspect ratio information is provided by means of aspect_ratio_information and (optional)
display_horizontal_size and display_vertical_size in the sequence_display_extension(). Examples of
appropriate values for signals sampled in accordance with Recommendation ITU-R BT.601 are given in
Table D-1.
Table D-1. Example display size values.

Signal Format   display_horizontal_size   display_vertical_size
525-line        711                       483
625-line        702                       575

This specification implements tools to support 4:4:4 chrominance, for possible future use. However, this is
currently not supported in any profile.

D.2.2 Movie timing 

A decoder can implement 3:2 pull down when a sequence of progressive pictures is encoded. Each
encoded movie picture can independently specify whether it is displayed for two or three video field
periods, so "irregular" 3:2 pull down source material can be transmitted as progressive video. Two flags,
top_field_first and repeat_first_field, are transmitted with the Picture Coding Extensions and adequately
describe the necessary display timing.
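
The effect of the two flags on display timing can be illustrated with a short Python sketch. It is not part of this specification, and it assumes the usual reading of the flags for interlaced display of progressive frames: top_field_first selects which field is output first, and repeat_first_field causes that first field to be output again as a third field. The function name displayed_fields is hypothetical; 'T' and 'B' stand for top and bottom fields.

```python
def displayed_fields(frames):
    """Return the field parities output for a list of coded frames.

    frames is a list of (top_field_first, repeat_first_field) pairs.
    Each frame yields two fields, or three when repeat_first_field is set.
    """
    fields = []
    for top_field_first, repeat_first_field in frames:
        first, second = ("T", "B") if top_field_first else ("B", "T")
        fields.extend([first, second])
        if repeat_first_field:
            fields.append(first)        # the first field is shown again
    return fields


# Regular 3:2 pull down: four film frames become ten video fields.
cadence = [(True, True), (False, False), (False, True), (True, False)]
fields = displayed_fields(cadence)
```

With this cadence the field parities alternate strictly, and 24 film frames per second produce 60 fields per second, that is about 30 video frames per second. An irregular cadence is expressed simply by a different sequence of flag pairs.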

D.2.3 Display format control

The display process converts a sequence of digital frames (in the case of progressive video) or fields (in
the case of interlaced video) to output video. It is not a normative part of this standard. The video
syntax of this specification does communicate certain display parameters for use in reconstructing the
video. Optional information (in the sequence display extension) specifies the chromaticities, the display
primaries, the opto-electronic transfer characteristics (e.g., the value of gamma) and the RGB-to-
luminance/chrominance conversion matrix.

Moreover, a display window within the encoded raster may be defined as, e.g., in the case of pan and
scan. Alternatively the encoded raster may be defined as a window on a large area display device. In the
case of pan-scan the position of the window representing the displayed region of a larger picture can be
specified on a field-by-field basis. It is specified in the Picture display extension described in 6.3.12. A
typical use for the pan-scan window is to describe the "important" 4:3 aspect ratio rectangle within a 16:9
video sequence. Similarly, in the case of small encoded pictures on a large display the size of the display
and the position of the window within that display may be specified.

D.2.4 Transparent coding of composite video 

Decoding from PAL/NTSC before transmission and re-coding to PAL/NTSC after transmission of
composite source signals in applications that are not low quality, such as contribution and distribution,
requires a precise reconstruction of the carrier amplitude and phase reference signal (and v-axis switch
for PAL).

The input format can be indicated in the sequence header using the video_format bits. Possible source
formats are: PAL, NTSC, SECAM and MAC. Reconstruction of the carrier signal is possible by using the
carrier parameters: v_axis, field_sequence, sub_carrier, burst_amplitude and sub_carrier_phase that are
enabled by setting the composite_display_flag in the picture_coding_extension().

D.3 Picture quality

High picture quality is provided according to the bitrate used. Provision for very high picture quality is
made by sufficiently high bitrate limits relating to a certain level in a particular profile. High chrominance
band quality can be achieved by using 4:2:2 chrominance.

Quantiser matrices can be downloaded and used with a small quantiser_scale_code to achieve near
lossless coding.



Recommendation ITU-T H.262 (1995 E) | ISO/IEC 13818-2 : 1995 (E)



Moreover, scalable codisg with flexible bitrate allows for service or quality hierarchy and graceful 
degradation. £.g., decoding a subset of the bitstream carrying a lower resolution picture allows for 
decoding this signal in a low-cost receiver with related quality; decoding the complete bitstream allows to 
obtain the high overall quality. 

Furthermore, operation at low bitrates can be accommodated by using low frame rates (by either
pre-processing before coding or frame skipping indicated by the temporal_reference in the picture header) and
low spatial resolution. 

D.4 Data rate control 

The number of transmitted bits per unit time, which is selectable in a wide range, may be controlled in 
two ways, which are both supported by this specification. A bit_rate description is transmitted with the
Sequence Header Code. 

For constant bitrate (CBR) coding, the number of transmitted bits per unit time is constant on the channel. 
Since the encoder output rate generally varies depending on the picture content, it shall regulate the rate 
constant by buffering etc. In CBR, picture quality may vary depending on its content.

The other mode is the variable bitrate (VBR) coding, in which case the number of transmitted bits per unit
time may vary on the channel under some constraint. VBR is meant to provide constant quality coding.
A model for VBR application is near-constant-quality coding over B-ISDN channels subject to Usage 
Parameter Control (UPC). 
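
The buffering mentioned for CBR can be pictured with a small Python sketch that tracks the fullness of an encoder output buffer drained at a constant rate. This is only an illustrative model under stated assumptions: the function name, the per-picture drain of bit_rate / frame_rate and the overflow handling are hypothetical and are not the VBV defined in Annex C.

```python
def simulate_encoder_buffer(picture_bits, bit_rate, frame_rate, buffer_size):
    """Track encoder output buffer fullness on a constant-rate channel.

    Hypothetical model: each coded picture is added at once, then the channel
    removes bit_rate / frame_rate bits during one picture period.
    """
    drain = bit_rate / frame_rate
    fullness = 0.0
    trace = []
    for bits in picture_bits:
        fullness += bits
        if fullness > buffer_size:
            # A real encoder would have quantised more coarsely before this point.
            raise OverflowError("encoder buffer overflow")
        # At zero fullness a real CBR encoder would need to insert stuffing.
        fullness = max(0.0, fullness - drain)
        trace.append(fullness)
    return trace


if __name__ == "__main__":
    # One large picture followed by two small ones at 4 Mbit/s, 25 pictures/s.
    print(simulate_encoder_buffer([400000, 100000, 100000], 4000000, 25, 1835008))
```

A rate controller would watch this fullness and adjust the quantiser so that the buffer neither overflows nor runs empty.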

D.5 Low delay mode 

A low encoding and decoding delay mode is accommodated for real-time video communications such as 
visual telephony, video-conferencing, monitoring. Total encoding and decoding delay of less than 150 
milliseconds can be achieved for low delay mode operation of this specification. Setting the low_delay flag 
in the Sequence Header code defines a low delay bitstream. 

The total encoding and decoding delay can be kept low by generating a bitstream which does not contain
B-pictures. This prevents frame reordering delay. By using dual-prime prediction for coded P-frames the
picture quality can still be high. 

A low buffer occupancy for both encoder and decoder is needed for low delay. Large coded pictures should 
be avoided by the encoder. By using intra update on the basis of one or more slices per frame (intra slices) 
instead of intra frames this can be accommodated. 

If, in low delay operation, the desired number of bits per frame is exceeded, the encoder can skip
one or more frames. This action is indicated by a discontinuity in the value of temporal_reference for the
next picture (see the semantic definition in 6.3.9) and may cause C.7 of the VBV to apply, i.e. the decoder
buffer would underflow if some frames are not repeated by the decoder.
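
A decoder-side view of this discontinuity can be sketched in Python as follows. The sketch assumes a low delay bitstream without B-pictures, so that coded order equals display order, and it ignores the reset at group of pictures headers and the separate rules for big pictures; the function names are hypothetical.

```python
def skipped_frames(prev_tr, tr, modulo=1024):
    """Frames skipped between two consecutively coded pictures.

    Assumes no B-pictures and no group of pictures header in between.
    """
    return (tr - prev_tr - 1) % modulo


def total_skipped(temporal_refs):
    """Sum the skipped frames over a list of temporal_reference values."""
    return sum(skipped_frames(a, b) for a, b in zip(temporal_refs, temporal_refs[1:]))


if __name__ == "__main__":
    # Pictures 3 and 4 were skipped by the encoder.
    print(total_skipped([0, 1, 2, 5, 6]))
```

The modulo arithmetic keeps the count correct when temporal_reference wraps from 1023 back to 0.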

D.6 Random access/channel hopping 

The syntax of this specification supports random access and channel hopping. Sufficient random 
access/channel hopping functionality is possible by encoding suitable random access points into the 
bitstream without significant loss of image quality. 

Random access is an essential feature for video on a storage medium. It requires that any picture can be
accessed and decoded in a limited amount of time. It implies the existence of access points in the
bitstream, that is, segments of information that are identifiable and can be decoded without reference to
other segments of data. In this specification access points are provided by sequence_header() and this is
then followed by intra information (picture data that can be decoded without access to previously decoded
pictures). A spacing of two random access points per second can be achieved without significant loss of
picture quality.

Channel hopping is the similar situation in transmission applications such as broadcasting. As soon as a 
new channel has been selected and the bitstream of the selected channel is available to the decoder, the 
next data entry, i.e. random access point, has to be found to start decoding the new program in the manner
outlined in the previous paragraph. 



D.7 Scalability

The syntax of this specification supports bitstream scalability. To accommodate the diverse functionality
requirements of the applications envisaged by this specification a number of bitstream scalability tools 
have been developed: 

• SNR scalability mainly targets applications which require graceful degradation.

• Chroma simulcast targets applications with high chrominance quality requirements.

• Data partitioning is primarily targeted for cell loss resilience in ATM networks.

• Temporal scalability is a method suitable for interworking of services using high temporal
resolution progressive video formats. Also suitable for high quality graceful degradation in the
presence of channel errors.

• Spatial scalability allows multiresolution coding technique suitable for video service
interworking applications. This tool can also provide coding modes to achieve compatibility with
existing coding standards, i.e. ISO/IEC 11172-2, at the lower layer.



D.7.1 Use of SNR scalability at a single spatial resolution 

The aim of SNR scalability is primarily to provide a mechanism for transmission of a two layer service, 
these two layers providing the same picture resolution but different quality level. For example, the 
transmission of service with two different quality levels is expected to become useful in the future for some
TV broadcast applications, especially when very good picture quality is needed for large size display
receivers. The sequence is encoded into two bitstreams called lower and enhancement layer bitstreams.
The lower layer bitstream can be decoded independently from the enhancement layer bitstream. The lower
layer, at 3 to 4 Mbit/s, would provide a picture quality equivalent to the current NTSC/PAL/SECAM
quality. Then, by using both the lower and the enhancement layer bitstreams, an enhanced decoder can 
deliver a picture quality subjectively close to the studio quality, with a total bitrate of 7 to 12 Mbit/s. 



D.7.1.1 Additional features 



D.7.1.1.1 Error resilience 

As described in D. 12 the SNR scalable scheme can be used as a mechanism for error resilience. If the two 
layer bitstreams are received with different error rate, the lower layer, better protected, stands as a good












D.7.1.1.2 Chroma simulcast 

The SNR scalable syntax can be used in a chroma simulcast system. The goal of such a scheme would be 
to provide a mechanism for simultaneous distribution of services with the same luminance resolution but
different chrominance sampling format (e.g. 4:2:0 in the lower layer and 4:2:2, when adding the
enhancement layer and the simulcast chrominance components) for applications which would require
such a feature. The SNR scalable enhancement layer contains some luminance refinement. The 4:2:2
chrominance is sent in simulcast. Only chrominance DC is predicted from the lower layer. The
combination of both layer luminance and of the 4:2:2 chrominance constitutes the high quality level.

D.7.1.2 SNR scalable encoding process 

D.7.1.2.1 Description

In the lower layer, the encoding is similar to the non scalable situation in terms of decisions, adaptive
quantisation, buffer regulation. The intra or error prediction macroblocks are DCT transformed. The
coefficients are then quantised using a first rather coarse quantiser. The quantised coefficients are then
VLC coded and sent together with the required side information (macroblock_type, motion vectors,
coded_block_pattern()).

In parallel, the quantised DCT coefficients coming from the lower layer, are dequantised. The residual 
error between the coefficients and the dequantised coefficients is then re-quantised, using a second finer 
quantiser. The resulting refinement coefficients are VLC coded and form the additional enhancement 
layer, together with a marginal amount of side information (quantiser_scale_code,
coded_block_pattern(), ...). The non-intra VLC table is used for all the coefficients in the enhancement
layer, since it is of differential nature. 
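
The two-stage quantisation described above can be sketched in Python. For simplicity the sketch assumes plain uniform quantisers with a single step size per layer; weighting matrices, quantiser_scale_code handling and VLC coding are left out, and the function names are hypothetical.

```python
def snr_two_layer(coeffs, coarse_step, fine_step):
    """Split DCT coefficients into lower layer and enhancement layer levels."""
    # Lower layer: first, rather coarse quantiser.
    base_levels = [round(c / coarse_step) for c in coeffs]
    # Dequantise the lower layer to find what the base decoder reconstructs.
    base_recon = [level * coarse_step for level in base_levels]
    # Enhancement layer: re-quantise the residual with a second, finer quantiser.
    enh_levels = [round((c - r) / fine_step) for c, r in zip(coeffs, base_recon)]
    return base_levels, enh_levels


def reconstruct(base_levels, enh_levels, coarse_step, fine_step):
    """Rebuild coefficients from the lower layer plus the refinement layer."""
    return [b * coarse_step + e * fine_step for b, e in zip(base_levels, enh_levels)]


if __name__ == "__main__":
    coeffs = [100, -37, 5]
    base, enh = snr_two_layer(coeffs, 16, 4)
    print([b * 16 for b in base])        # lower layer only
    print(reconstruct(base, enh, 16, 4))  # lower layer plus enhancement layer
```

Decoding only the lower layer leaves the coarse quantisation error in place; adding the refinement coefficients reduces the error to that of the finer quantiser.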

D.7.1.2.2 A few important remarks

Since the prediction is the same for both layers, it is recommended to use the refined images in the motion 
estimation loop (e.g. the images obtained by the conjunction of the lower and the enhancement layer). 
Thus, there is a drift between the prediction used at the encoder side and what the low level decoder can 
get as a prediction. This drift does accumulate from P-picture to P-picture and is reset to zero at each
I-picture. However the drift has been found to have little visual effect when there is an I-picture every 15
pictures or so. 

Since the enhancement layer only contains refinement coefficients, the needed overhead is quite reduced: 
most of the information about the macroblocks (macroblock types, motion vectors...) is included in the
lower layer. Therefore the syntax of this stream is very much simplified:

- the macroblock type table only indicates if the quantiser_scale_code in the enhancement layer 
has changed and if the macroblock is NOT-CODED (for first and last macroblock of the slices), which 
amounts to three VLC words. 

- quantiser_scale_code in the enhancement layer is sent if the value has changed. 

- coded_block_pattern() is transmitted for all coded macroblocks.

All NON-CODED macroblocks that are not at the beginning or end of a slice are skipped, since the
overhead information can be deduced from the lower layer. 

It is recommended to use different weighting matrices for the lower and the enhancement layer. Some 
better results are obtained when the first quantisation is steeper than the second one. However it is 
recommended not to quantise too coarsely the DCT coefficient that corresponds to the interlace motion, to 
avoid juddering effects. 

D.7.2 Multiple resolution scalability bitstreams using SNR scalability 

The aim of resolution scalability is to decode the base layer video suitable for display at reduced spatial
resolution. In addition it is desirable to implement a decoder with reduced complexity for this purpose.
This functionality is useful for applications where the receiver display is either not capable or willing to
display the full spatial resolution supported by both layers and for applications where software decoding is
targeted. The method described in this clause uses the SNR Scalability syntax outlined in clause 7 to
transmit the video in two layers. Note that none of the options suggested in this clause changes the
structure of the highest resolution decoder, which remains identical to the one outlined in Figure 7-14.
The bitstream generated on both layers is compatible with the HIGH profile. However, the base layer
decoder could be implemented differently with reduced implementation complexity suitable to software
decoding.



D.7.2.1 Decoder implementation

In decoding to a smaller spatial resolution, an inverse DCT of reduced size could be used when decoding
the base layer. The frame memory requirement in the decoder MC loop would also be reduced
accordingly.

If the bitstream of the two SNR Scalability layers was generated with only one MC loop at the encoder the
base video will be subject to drift. This drift may or may not be acceptable depending on the application.
Image quality will, to a large extent, depend on the sub-sample accuracy used for motion compensation in
the decoder. It is possible to use the full precision motion vector as transmitted in the base layer for
motion compensation with a sub-sample accuracy comparable to that of the higher layer. Drift can be
minimised by using advanced sub-sample interpolation filters (see [12], [13] and [16] in Annex G).

D.7.2.2 Encoder implementation

It is possible to tailor the base layer SNR Scalability bitstream to the particular requirements of the
resolution scaled decoder. A smaller DCT size can be more easily supported by only transmitting the
appropriate DCT-coefficients belonging to the appropriate subset in the base layer bitstream.

Finally it is possible to support a drift-free decoding at lower resolution scale by incorporating more than
one MC loop in the encoder scheme. An identical reconstruction process is used in the encoder and
decoder.

D.7.3 Bitrate allocation in data partitioning

Data partitioning allows splitting a bitstream for increased error resilience when two channels with
different error performance are available. It is often required to constrain the bitrate of each partition. This
can be achieved at the encoder by adaptively changing priority breakpoint at each slice.

The encoder can use two virtual buffers for the two bitstreams, and implement feedback rate control by
picking a priority breakpoint that approximately meets the target rate for each channel. Difference
between target and actual rates is used to revise the target for the next frame in a feedback loop.

It is desirable to vary the bitrate split from frame to frame for higher error resilience. Typically, I-pictures
benefit from having more of the data in partition 0 than the P-pictures while B-pictures could be placed
entirely in partition 1.

D.7.4 Temporal scalability

A two layer temporally scalable coding structure consisting of a base and an enhancement layer is
shown in Figure D-1. Consider video input at full temporal rate to temporal demultiplexer; in our example
it is temporally demultiplexed to form two video sequences, one input to the base layer encoder and the
other input to the enhancement layer encoder. The base layer encoder is a non hierarchical encoder
operating at half temporal rate, the enhancement layer encoder is like a MAIN profile encoder and also
operates at half temporal rate except that it uses base layer decoded pictures for motion compensated
prediction. The encoded bitstreams of base and enhancement layers are multiplexed as a single stream in
the systems multiplexer. The systems demultiplexer extracts two bitstreams and inputs corresponding
bitstreams to base and enhancement layer decoders. The output of the base layer decoder can be shown
standalone at half temporal rate or after multiplexing with enhancement layer decoded frames and shown
at full temporal rate.

Figure D-1. A two layer codec structure for temporal scalability

The following forms of temporal scalability are supported and are expressed as higher layer: base layer-to- 
enhancement layer picture formats. 

1. Progressive: progressive-to-progressive Temporal Scalability 

2. Progressive: interlace-to-interlace Temporal Scalability 

3. Interlace: interlace-to-interlace Temporal Scalability 

D.7.4.1 Progressive: progressive-to-progressive temporal scalability 

Assuming progressive video input, if it is necessary to code progressive-format video in base and
enhancement layers, the operation of temporal demux may be relatively simple and involve temporal
demultiplexing of input frames into two progressive sequences. The operation of temporal remux is
inverse, i.e., it performs remultiplexing of two progressive sequences to generate full temporal rate
progressive output. See Figure D-2.

Figure D-2. Temporal demultiplexer and remultiplexer for
progressive: progressive-to-progressive temporal scalability
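
The simple demultiplexing and remultiplexing of frames can be sketched in Python as below. Sending even-numbered frames to the base layer and odd-numbered frames to the enhancement layer is an assumption made for the example; the function names are hypothetical.

```python
def temporal_demux(frames):
    """Split a full-rate sequence into two half-rate sequences.

    Assumed assignment: even-indexed frames form the base layer input and
    odd-indexed frames form the enhancement layer input.
    """
    return frames[0::2], frames[1::2]


def temporal_remux(base, enhancement):
    """Interleave decoded base and enhancement frames at full temporal rate."""
    out = []
    for i, frame in enumerate(base):
        out.append(frame)
        if i < len(enhancement):
            out.append(enhancement[i])
    return out


if __name__ == "__main__":
    frames = list(range(7))
    base, enh = temporal_demux(frames)
    print(base, enh)
    print(temporal_remux(base, enh))
```

Displaying only the base sequence gives the half temporal rate output; interleaving both restores the original frame order.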



D.7.4.2 Progressive: interlace-to-interlace temporal scalability 

Again, assuming full temporal rate progressive video input, if it is necessary to code interlaced format
video in base layer, the operation of temporal demux may involve progressive to two interlace conversion;
this process involves extraction of a normal interlaced and a complementary interlaced sequence from
progressive input video. The operation of temporal remux is inverse, i.e., it performs two interlace to
progressive conversion to generate full temporal rate progressive output. Figure D-3 and Figure D-4 show
operations required in progressive to two interlace and two interlace to progressive conversion.

Figure D-3. Progressive to two interlace conversion.

Figure D-5. Temporal demultiplexer and remultiplexer for
progressive: interlace-to-interlace temporal scalability



D.7.4.3 Interlace: interlace-to-interlace temporal scalability

Assuming interlaced video input, if it is necessary to code interlaced-format video in base and
enhancement layers, the operation of temporal demux may be relatively simple and involve temporal
demultiplexing of input frames into two interlaced sequences. The operation of temporal remux is
inverse, i.e., it performs remultiplexing of two interlaced sequences to generate full temporal rate
interlaced output. The demultiplexing and remultiplexing is similar to that in Figure D-2.

D.7.5 Hybrids of the spatial, the SNR and the temporal scalable extensions 

This standard also allows combinations of scalability tools to produce more than 2 video layers as may be 
useful and practical to support more demanding applications. Taken two at a time, 3 explicit 
combinations result. Moreover, within each combination, the order in which each scalability is applied, 
when interchanged, results in distinct applications. In the hybrid scalabilities involving three layers, the
layers are referred to as base layer, enhancement layer 1 and enhancement layer 2. 

D.7.5.1 Spatial and SNR hybrid scalability applications 

A) HDTV with standard TV at two qualities: 

Base layer provides standard TV resolution at basic quality, enhancement layer 1 helps generate standard 
TV resolution but at higher quality by SNR scalability and the enhancement layer 2 employs HDTV 
resolution and format which is coded with spatial scalability with respect to high quality standard TV 
resolution generated by using enhancement layer 1.

B) Standard TV at two qualities and low definition TV/videophone: 

Base layer provides videophone/low definition quality, using spatial scalability enhancement layer 1 
provides standard TV resolution at a basic quality and enhancement layer 2 uses SNR scalability to help 
generate high quality standard TV. 

C) HDTV at two qualities and standard TV: 

Base layer provides standard TV resolution. Using spatial scalability enhancement layer 1 provides basic 
quality HDTV and enhancement layer 2 uses SNR scalability to help generate high quality HDTV.

D.7.5.2 Spatial and temporal hybrid scalability applications 

A) High temporal resolution progressive HDTV with basic interlaced HDTV and standard TV:

Base layer provides standard TV resolution, using spatial scalability enhancement layer 1 provides basic
HDTV of interlaced format and enhancement layer 2 uses temporal scalability to help generate full
temporal resolution progressive HDTV.

B) High resolution progressive HDTV with enhanced progressive HDTV and basic progressive HDTV: 

Base layer provides basic progressive HDTV format at temporal resolution, using temporal scalability 
enhancement layer 1 helps generate progressive HDTV at full temporal resolution and enhancement layer
2 uses spatial scalability to provide high spatial resolution progressive HDTV (at full temporal 
resolution). 

C) High resolution progressive HDTV with enhanced progressive HDTV and basic interlaced HDTV: 

Base layer provides basic interlaced HDTV format, using temporal scalability enhancement layer 1 helps 
generate progressive HDTV at full temporal resolution and enhancement layer 2 uses spatial scalability to 
provide high spatial resolution progressive HDTV (at full temporal resolution).

D.7.5.3 Temporal and SNR hybrid scalability applications

A) Enhanced progressive HDTV with basic progressive HDTV at two qualities: 

Base layer provides basic progressive HDTV at lower temporal rate, using temporal scalability 
enhancement layer 1 helps generate progressive HDTV at full temporal rate but with basic quality and 
enhancement layer 2 uses SNR scalability to help generate progressive HDTV with high quality (at full 
temporal resolution). 

B) Enhanced progressive HDTV with basic interlaced HDTV at two qualities: 

Base layer provides interlaced HDTV of basic quality, using SNR scalability enhancement layer 1 helps 
generate interlaced HDTV at high quality and enhancement layer 2 uses temporal scalability to help
generate progressive HDTV at full temporal resolution (at high quality). 

D.8 Compatibility 

The standard supports compatibility between different resolution formats as well as compatibility with 
ISO/IEC 11172-2 (and Recommendation ITU-T H.261).

D.8.1 Compatibility with higher and lower resolution formats 

This specification supports compatibility between different resolution video formats. Compatibility is 
provided for spatial and temporal resolutions with the Spatial Scalability and Temporal Scalability tools. 
The video is encoded into two resolution layers. A decoder only capable or willing to display a lower 
resolution video accepts and decodes the lower layer bitstream. The full resolution video can be 
reconstructed by accepting and decoding both resolution layers provided.

D.8.2 Compatibility with ISO/IEC 11172-2 (and Recommendation ITU-T H.261) 

The syntax of this specification supports both backward and forward compatibility with ISO/IEC 11172-2.
Forward compatibility with ISO/IEC 11172-2 is provided since the syntax of this specification is a
superset of the ISO/IEC 11172-2 syntax. The Spatial Scalability tool provided by this specification allows
using ISO/IEC 11172-2 coding in the lower resolution, i.e. base layer, thus achieving backward 
compatibility. 

The video syntax contains tools that are needed to implement H.261 compatibility that may be needed for
possible future use; however, this is currently not supported by any profile.

Simulcast serves as a simple alternative method to provide backward compatibility with both H.261 and
ISO/IEC 11172-2.

D.9 Differences between this specification and ISO/IEC 11172-2



This clause lists the differences between MPEG-1 Video and MPEG-2 Video. 

All MPEG-2 Video decoders that comply with currently defined profiles and levels are required to decode 
MPEG-1 constrained bitstreams. 

In most instances, MPEG-2 represents a super-set of MPEG-1. For example, the MPEG-1 coefficient
zigzag scanning order is one of the two coefficient scanning modes of MPEG-2. However, in some cases,
there are syntax elements (or semantics) of MPEG-1 that do not have a direct equivalent in MPEG-2.
This document lists all those elements.

This document may help implementers identify those elements of the MPEG-1 video syntax (or semantics)
that do not have their direct equivalent in MPEG-2, and therefore require special care in order to
guarantee MPEG-1 compatibility.

In this clause, MPEG-1 refers to ISO/IEC 11172-2 whilst MPEG-2 refers to this specification.

D.9.1 IDCT mismatch

MPEG-1 - The IDCT mismatch control consists in adding (or removing) one to each non-zero coefficient
that would have been even after inverse quantisation. This is described as part of the inverse quantisation
process, in 2.4.4.1, 2.4.4.2 and 2.4.4.3 of MPEG-1.

MPEG-2 - The IDCT mismatch control consists in adding (or removing) one to coefficient [7][7] if the
sum of all coefficients is even after inverse quantisation. This is described in 7.4.4 of MPEG-2.
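
The difference between the two mismatch controls can be illustrated with a Python sketch operating on an 8x8 block of inverse-quantised coefficients. It is a simplified reading of the two descriptions above: in the MPEG-1 variant the adjustment is assumed to move each even non-zero coefficient one step toward zero, saturation and intra DC handling are omitted, and the function names are hypothetical.

```python
def mismatch_mpeg1(block):
    """MPEG-1 style: make every non-zero even coefficient odd.

    Assumption: the adjustment is one step toward zero.
    """
    def oddify(v):
        if v != 0 and v % 2 == 0:
            return v - 1 if v > 0 else v + 1
        return v
    return [[oddify(v) for v in row] for row in block]


def mismatch_mpeg2(block):
    """MPEG-2 style: if the coefficient sum is even, toggle the parity of [7][7]."""
    out = [row[:] for row in block]
    total = sum(sum(row) for row in out)
    if total % 2 == 0:
        # Subtract one from an odd value, add one to an even value.
        out[7][7] += -1 if out[7][7] % 2 else 1
    return out


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0] = 4
    print(mismatch_mpeg1(block)[0][0])
    print(mismatch_mpeg2(block)[7][7])
```

The MPEG-1 method touches many coefficients, whereas the MPEG-2 method changes at most one coefficient per block and always leaves the block sum odd.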

D.9.2 Macroblock stuffing 

MPEG-1 - The VLC code '0000 0001 111' (macroblock_stuffing) can be inserted any number of times
before each macroblock_address_increment. This code must be discarded by the decoder. This is
described in 2.4.2.7 of MPEG-1.

MPEG-2 - This VLC code is reserved and not used in MPEG-2. In MPEG-2, stuffing can be generated
only by inserting zero bytes before a start-code. This is described in 5.2.3 of MPEG-2.

D.9.3 Run-level escape syntax

MPEG-1 - Run-level values that cannot be coded with a VLC are coded by the escape code '0000 01'
followed by either a 14-bit FLC (-127 <= level <= 127), or a 22-bit FLC (-255 <= level <= 255). This is
described in Annex B, 2-B5 of MPEG-1.

MPEG-2 - Run-level values that cannot be coded with a VLC are coded by the escape code '0000 01'
followed by an 18-bit FLC (-2047 <= level <= 2047). This is described in 7.2.2.3 of MPEG-2.
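
As a sketch of the MPEG-2 case, the following Python function decodes the 18 bits that follow the escape code. The text above gives only the total length and the level range; the split into a 6-bit run followed by a 12-bit two's-complement level is an assumption of this example and should be checked against 7.2.2.3. The function name and the bit-string input are hypothetical.

```python
def decode_escape_mpeg2(bits):
    """Decode the 18-bit field after the escape code '0000 01'.

    Assumed layout: 6-bit run, then 12-bit two's-complement level.
    bits is a string of 18 '0'/'1' characters.
    """
    if len(bits) != 18 or set(bits) - {"0", "1"}:
        raise ValueError("expected 18 binary digits")
    run = int(bits[:6], 2)
    raw = int(bits[6:], 2)
    level = raw - 4096 if raw >= 2048 else raw
    if level == 0 or level == -2048:
        # Outside the range -2047..2047 stated in the text, or a zero level.
        raise ValueError("level not allowed")
    return run, level


if __name__ == "__main__":
    print(decode_escape_mpeg2("000011" + "000000000101"))
```

A single fixed-length form covers the whole level range, in contrast to the two escape lengths of MPEG-1.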

D.9.4 Chrominance samples horizontal position 

MPEG-1 - The horizontal position of chrominance samples is half the way between luminance samples. 
This is described in 2.4.1 of MPEG-1. 

MPEG-2 - The horizontal position of chrominance samples is co-located with luminance samples. This is 
described in 6.1.1.8 of MPEG-2. 

D.9.5 Slices 

MPEG-1 - Slices do not have to start and end on the same horizontal row of macroblocks. Consequently
it is possible to have all the macroblocks of a picture in a single slice. This is described in 2.4.1 of
MPEG-1.

MPEG-2 - Slices always start and end on the same horizontal row of macroblocks. This is described in
6.1.2 of MPEG-2.

D.9.6 D-Pictures 

MPEG-1 - A special syntax is defined for D-pictures (picture_coding_type = 4). D-pictures are like
I-pictures with only Intra-DC coefficients, no End of Block, and a special end_of_macroblock code '1'.

MPEG-2 - D-pictures (picture_coding_type = 4) are not permitted. This is described in 6.3.9 of MPEG-2.

D.9.7 Full-pel motion vectors

MPEG-1 - The syntax elements full_pel_forward_vector and full_pel_backward_vector can be set to '1'.
When this is the case, the motion vectors that are coded are in full-pel units instead of half-pel units.
Motion vector coordinates must be multiplied by two before being used for the prediction. This is
described in 2.4.4.2 and 2.4.4.3 of MPEG-1.

MPEG-2 - The syntax elements full_pel_forward_vector and full_pel_backward_vector must be equal to
'0'. Motion vectors are always coded in half-pel units.

D.9.8 Aspect ratio information 

MPEG-1 - The 4-bit pel_aspect_ratio value coded in the sequence header specifies the pel aspect ratio.
This is described in 2.4.3.2 of MPEG-1.

MPEG-2 - The 4-bit aspect_ratio_information value coded in the sequence header specifies the display
aspect ratio. The pel aspect ratio is derived from this and from the frame size and display size. This is
described in 6.3.3 of MPEG-2.

D.9.9 forward_f_code and backward_f_code

MPEG-1 - The f_code values used for decoding the motion vectors are forward_f_code and
backward_f_code, located in the picture_header().

MPEG-2 - The f_code values used for decoding the motion vectors are f_code[s][t], located in the
picture_coding_extension(). The values of forward_f_code and backward_f_code must be '111' and are
ignored. This is described in 6.3.9 of MPEG-2.

D.9.10 constrained_parameter_flag and maximum horizontal_size

MPEG-1 - When the constrained_parameter_flag is set to '1', this indicates that a certain number of
constraints are verified. One of those constraints is that horizontal_size <= 768. It should be noted that a
constrained MPEG-1 video bitstream can have pictures with a horizontal size of up to 768 pels. This is
described in 2.4.3.2 of MPEG-1.

MPEG-2 - The constrained_parameter_flag mechanism has been replaced by the profile and level
mechanism. However, it should be noted that MP@ML bitstreams cannot have horizontal size larger than
720 pels. This is described in 8.2.3.1 of MPEG-2.

D.9.11 bit_rate and vbv_delay

MPEG-1 - bit_rate and vbv_delay are set to 3FFFF and FFFF (hex) respectively to indicate variable 
bitrate. Other values are for constant bitrate. 

MPEG-2 - The semantics for bit_rate are changed. In variable bitrate operation, vbv_delay may be set to
FFFF (hex), but a different value does not necessarily mean that the bitrate is constant. Constant bitrate
operation is simply a special case of variable bitrate operation. There is no way to tell that a bitstream is
constant bitrate without examining all of the vbv_delay values and making complicated computations.
Even if the bitrate is constant the value of bit_rate may not be the actual bitrate since bit_rate need only be
an upper bound to the actual bitrate.

D.9.12 VBV 

MPEG-1 - VBV is only defined for constant bitrate operation. The STD supersedes the VBV model for 
variable bitrate operation. 

MPEG-2 - VBV is only defined for variable bitrate operation. Constant bitrate operation is viewed as a 
special case of variable bitrate operation. 

D.9.13 temporal_reference

MPEG-1 - temporal_reference is incremented by one modulo 1024 for each coded picture, and reset to
zero at each group of pictures header.

MPEG-2 - If there are no big pictures, temporal_reference is incremented by one modulo 1024 for each
coded picture, and reset to zero at each group of pictures header (as in MPEG-1). If there are big pictures
(in low delay bitstreams), then temporal_reference follows different rules.

D.9.14 MPEG-2 syntax vs. MPEG-1 syntax 

It is possible to make MPEG-2 bitstreams that have a syntax very close to MPEG-1, by using particular 
values for the various MPEG-2 syntax elements that do not exist in the MPEG-1 syntax. 

In other words, the MPEG-1 decoding process is the same (except for the particular points mentioned 
earlier) as the MPEG-2 decoding process when:

progressive_sequence = '1' (progressive sequence).

chroma_format = '01' (4:2:0)

frame_rate_extension_n = 0 and frame_rate_extension_d = 0 (MPEG-1 frame-rate)

intra_dc_precision = '00' (8-bit Intra-DC precision)

picture_structure = '11' (frame-picture, because progressive_sequence = '1')

frame_pred_frame_dct = 1 (only frame-based prediction and frame DCT)

concealment_motion_vectors = '0' (no concealment motion vectors).

q_scale_type = '0' (linear quantiser_scale)

intra_vlc_format = '0' (MPEG-1 VLC table for Intra MBs).

alternate_scan = '0' (MPEG-1 zigzag scanning order)

repeat_first_field = '0' (because progressive_sequence = '1')

chroma_420_type = '1' (chrominance is "frame-based", because progressive_sequence = '1')

progressive_frame = '1' (because progressive_sequence = '1')
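
The settings listed above lend themselves to a simple check. The Python sketch below compares a dictionary of parsed syntax element values against that list; the dictionary representation and the function name are hypothetical, and the check says nothing about the other differences described earlier in this clause.

```python
# Values taken from the list above; binary strings are written as integers.
MPEG1_LIKE = {
    "progressive_sequence": 1,
    "chroma_format": 0b01,
    "frame_rate_extension_n": 0,
    "frame_rate_extension_d": 0,
    "intra_dc_precision": 0b00,
    "picture_structure": 0b11,
    "frame_pred_frame_dct": 1,
    "concealment_motion_vectors": 0,
    "q_scale_type": 0,
    "intra_vlc_format": 0,
    "alternate_scan": 0,
    "repeat_first_field": 0,
    "chroma_420_type": 1,
    "progressive_frame": 1,
}


def mpeg1_like_violations(params):
    """Return the names of elements that differ from the MPEG-1-like settings.

    params is a hypothetical dict of already parsed syntax element values;
    a missing element is reported as a violation.
    """
    return sorted(name for name, want in MPEG1_LIKE.items() if params.get(name) != want)


if __name__ == "__main__":
    stream = dict(MPEG1_LIKE, alternate_scan=1)
    print(mpeg1_like_violations(stream))
```

An empty result means that every listed element has its MPEG-1-like value.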

D.10 Complexity

The MPEG-2 standard supports combinations of high performance/high complexity and low
performance/low complexity decoders. This is accommodated by MPEG-2 with the Profiles and Levels
definitions which introduce new sets of tools and functionality with every new profile. It is thus possible to
trade-off performance of the MPEG-2 coding schemes by decreasing implementation complexity.
Moreover, certain restrictions could allow reducing decoder implementation cost.

D.11 Editing encoded bitstreams

Many operations on the encoded bitstream are supported to avoid the expense and quality costs of
re-coding. Editing, and concatenation of encoded bitstreams with no re-coding and no disruption of the
decoded image sequence is possible.

There is a conflict between the requirement for high compression and easy editing. The coding structure 
and syntax have not been designed with the primary aim of simplifying editing at any picture. 
Nevertheless a number of features have been included that enable editing of coded data. 

Editing of encoded MPEG-2 bitstreams is supported by the syntactic hierarchy of the encoded video
bitstream. Unique start codes are encoded at the different levels of the hierarchy (i.e. video sequence, group
of pictures etc.). Video can be encoded with intra-picture/intra-slice access points in the bitstream. This
enables the identification, access and editing of parts of the bitstream without the necessity to decode the
entire video.

D.12 Trick modes 

Certain DSM (Digital Storage Media) provide the capability of trick modes, such as FF/FR (Fast
Forward/Fast Reverse). The MPEG-2 syntax supports all special access, search and scan modes of
ISO/IEC 11172-2. This functionality is supported with the syntactic hierarchy of the video bitstream
which enables the identification of relevant parts within a video sequence. It can be assisted by MPEG-2
tools which provide bitstream scalability to limit the access bitrate (i.e. Data Partitioning and the general
slice structure). This clause provides some guidelines for decoding a bitstream provided by a DSM.

The decoder is informed by means of a 1-bit flag (DSM_trick_mode_flag) in the PES packet header. This
flag indicates that the bitstream is reconstructed by DSM in trick mode, and the bitstream is valid from
syntax point of view, but invalid from semantics point of view. When this bit is set, an 8-bit field
(DSM_trick_modes) follows. The semantics of DSM_trick_modes are in ISO/IEC 13818-1.
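
As an informative illustration, the following sketch shows how a receiver might locate the flag and the 8-bit field. The byte offsets are an assumption taken from the PES header layout of ISO/IEC 13818-1 (two flag bytes, a header length byte, then optional PTS/DTS, ESCR and ES_rate fields ahead of the trick mode byte); the function name read_dsm_trick_modes is hypothetical, and stream types that carry no such header are not handled.

```python
def read_dsm_trick_modes(pes: bytes):
    """Return the 8-bit DSM_trick_modes field of a PES packet, or None.

    Assumed layout: bytes 0-2 start code prefix, byte 3 stream_id,
    bytes 4-5 PES_packet_length, bytes 6-7 flags,
    byte 8 PES_header_data_length, then the optional fields.
    """
    if len(pes) < 9 or pes[0:3] != b"\x00\x00\x01":
        raise ValueError("not a PES packet")
    flags = pes[7]
    if not flags & 0x08:          # DSM_trick_mode_flag is not set
        return None
    offset = 9                    # first optional field
    pts_dts_flags = flags >> 6
    if pts_dts_flags == 0b10:     # PTS only
        offset += 5
    elif pts_dts_flags == 0b11:   # PTS and DTS
        offset += 10
    if flags & 0x20:              # ESCR_flag
        offset += 6
    if flags & 0x10:              # ES_rate_flag
        offset += 3
    if offset >= len(pes):
        raise ValueError("truncated PES header")
    return pes[offset]
```

A decoder would call this on each PES packet and switch to the trick mode behaviour of D.12.1 whenever the result is not None.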

D.12.1 Decoder

While the decoder is decoding a PES packet whose DSM_trick_mode_flag is set to 1, the decoder is
recommended to:

Decode bitstream and display according to DSM_trick_modes

Pre-processing

When the decoder encounters a PES packet whose DSM_trick_mode_flag is set to 1, the decoder is
recommended to:

Clear non trick mode bitstream from buffer

Post-processing

When the decoder encounters a PES packet whose DSM_trick_mode_flag is set to 0, the decoder is
recommended to:

Clear trick mode bitstream from buffer

Video Part

While the decoder is decoding a PES packet whose DSM_trick_mode_flag is set to 1, the decoder is
recommended to:

Neglect vbv_delay and temporal_reference value

Decode one picture and display it until next picture is decoded.

The bitstream in trick mode may have a gap between slices. When the decoder encounters a gap between
slices, the decoder is recommended to:

Decode the slice and display it according to the slice vertical position in slice header

Fill up the gap with co-sited part of the last displayed picture
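
The gap-filling recommendation can be sketched as follows. This is a simplified illustration only: it assumes one slice per macroblock row, represents each row as an opaque value, and uses 0-based row indices instead of the slice vertical position carried in the slice header. The function name fill_slice_gaps is hypothetical.

```python
def fill_slice_gaps(decoded_rows, last_displayed):
    """Build the picture to display in trick mode.

    decoded_rows: dict mapping a macroblock row index to its decoded data.
    last_displayed: list of rows of the last displayed picture.
    Rows with no decoded slice are taken from the co-sited row of the
    last displayed picture.
    """
    return [decoded_rows.get(row, previous)
            for row, previous in enumerate(last_displayed)]
```

For example, if only rows 0 and 2 of a four-row picture arrive, rows 1 and 3 are repeated from the last displayed picture.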

D.12.2 Encoder 

The encoder is recommended to: 

Encode with short size of slice with intra macroblocks. 

Encode with short periodic refreshment by intra picture or intra slice. 

DSM 

DSM is recommended to provide the bitstream in trick mode with perfect syntax. 

Pre-processing 

DSM is recommended to: 

Complete "normal" bitstream at picture_header() and higher syntactic structures.

System Part 

DSM is recommended to: 

Set DSM_trick_mode_flag to 1 in a PES packet header.
Set DSM_trick_modes (8-bit) according to the trick mode.

Video Part 

DSM is recommended to: 

Insert a sequence_header() with the same parameters as a normal bitstream.

Insert a sequence_extension() with the same parameters as a normal bitstream.

Insert a picture_header() with the same parameters as a normal bitstream except that it may be
preferable to indicate variable bit rate operation. One way to achieve this is to set vbv_delay to
FFFF (hex).

NOTE - In most cases temporal_reference and vbv_delay are ignored in a decoder, therefore the DSM
may not need to set temporal_reference and vbv_delay to correct values.

Concatenate slices which consist of intra coded macroblocks. The concatenated slices should
have slice vertical positions in increasing order.

D.13 Error resilience 

Most digital storage media and communication channels are not error-free. Appropriate channel coding
schemes should be used and are beyond the scope of this specification. Nevertheless the MPEG-2 syntax
supports error resilient modes relevant to cell loss in ATM networks and bit errors (isolated and in bursts)
in transmissions. The slice structure of the compression scheme defined in this specification allows a
decoder to recover after a residual data error and to resynchronise its decoding. Therefore, bit errors in the
coded data will cause errors in the decoded pictures to be limited in area. Decoders may be able to use
concealment strategies to disguise these errors. Error resilience includes graceful degradation in
proportion to bit error rate (BER) and graceful recovery in the face of missing video bits or data packets. It
has to be noted that all items may require additional support at the system level.

Being an example of a packet-based system, B-ISDN with its Asynchronous Transfer Mode (ATM) is 
addressed in some detail in the following. Similar statements can be made for other systems where certain 
packets of data are protected individually by means of forward error-correcting coding. 

ATM uses short, fixed length packets, called cells, consisting of a 5 byte header containing routing
information, and a user payload of 48 bytes. The nature of errors on ATM is such that some cells may be
lost, and the user payload of some cells may contain bit errors. Depending on AAL (ATM Adaptation
Layer) functionality, indications of lost cells and cells containing bit errors may be available.

As an indication of the impact of cell loss in an ATM environment, Table D-2 summarises the average
interval between cell losses for a range of CLR (Cell Loss Ratio) and service bitrates based on simple
statistical modelling. (A cell payload must be assumed for this. Allowing 1 byte/cell for AAL functions
leaves 376 bits = 47 bytes.) Note, however, that this summary ignores cell loss bursts and other shorter
term temporal statistics.



Table D-2. Average interval between cell losses for a range of CLR and service bitrates.

Bit Error Ratios (BERs) corresponding to the above mean times between errors can be calculated easily for
the case of isolated bit errors. The BER that would cause the same incidence rate of errors is found by
dividing by the cell payload size, i.e. BER = CLR/376.
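
The simple statistical model can be written out as a short calculation. The sketch below is an illustration under the stated assumption of 376 payload bits per cell and independent (non-bursty) losses; the function names are hypothetical and the results are not taken from Table D-2.

```python
PAYLOAD_BITS = 376  # 47 bytes per cell after allowing 1 byte for AAL functions


def mean_interval_between_losses(clr, bitrate_bps, payload_bits=PAYLOAD_BITS):
    """Average time in seconds between lost cells for a given CLR and bitrate."""
    cells_per_second = bitrate_bps / payload_bits
    return 1.0 / (clr * cells_per_second)


def equivalent_ber(clr, payload_bits=PAYLOAD_BITS):
    """BER of isolated bit errors giving the same incidence rate as the CLR."""
    return clr / payload_bits
```

For instance, at a service bitrate of 4 Mbit/s and a CLR of 10^-3 the model gives 376 / (10^-3 x 4 000 000) = 0,094 s between losses on average.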

The following techniques of minimising the impact of lost cells and other error/loss effects are provided
for reference, and indicate example methods of using the various tools available in this specification to
provide good performance in the presence of those errors. Note that the techniques described may be
applicable in the cases of packets of other sizes (e.g. LANs or certain storage media) or video data with
uncorrected errors of different characteristics, in addition to cell loss. It may be appropriate to treat a
known erasure (uncorrected bit error(s) known to exist somewhere in a data block) as a lost data block,
since the impact of bit errors cannot be predicted. However, this should be a decoder option. The
discussion that follows refers generally to "transport packets" where appropriate, to emphasise the
applicability to a variety of transport and storage systems. However, specific examples will refer to Cell
Loss Ratios (CLRs) because cell transport is the most completely defined at the time of preparing this
specification.

The error resilience techniques are summarised in three categories, covering methods of concealing the
error once it has occurred, and the restriction of the influence of a loss or error in both space (within a
picture) and time (from picture to picture).



D.13.1 Concealment possibilities

Concealment techniques hide the effect of losses/errors once they have occurred. Some concealment
methods can be implemented using any encoded bitstream, while others are reliant on the encoder to
structure the data or provide additional information to enable enhanced performance.

D.13.1.1 Temporal predictive concealment

A decoder can provide concealment of the errors by estimating the lost data from spatio-temporally
adjacent data. The decoder uses information which has been successfully received to make an informed
estimate of what should be displayed in place of the lost/errored data, under the assumption that the
picture characteristics are fairly similar across adjacent blocks (in both the spatial and temporal
dimensions). In the temporal case, this means estimation of errored or lost data from nearby fields or
frames.

D.13.1.1.1 Substitution from previous frame

The simplest possible approach is to replace a lost macroblock with the macroblock in the same location
in the previous picture. This approach is suitable for relatively static picture areas but block displacement
is noticeable for moving areas.

The "previous picture" must be interpreted with care due to the use of bi-directional prediction and a
difference between picture coded order and picture display order. When a macroblock is lost in a P- or
I-picture, it can be concealed by copying the data corresponding to the same macroblock in the previous
P-picture or I-picture. This ensures that the picture is complete before it is used for further prediction. Lost
macroblocks in B-pictures can be substituted from the last displayed picture, of any type, or from a future
I- or P-picture held in memory but not yet displayed.

D.13.1.1.2 Motion compensated concealment

The concealment from neighbouring pictures can be improved by estimating the motion vectors for the
lost macroblock, based on the motion vectors of neighbouring macroblocks in the affected picture
(provided these are not also lost). This improves the concealment in moving picture areas, but there is an
obvious problem with errors in macroblocks whose neighbouring macroblocks are coded intra, because
there are ordinarily no motion vectors. Encoder assistance to get around this problem is discussed in
D.13.1.1.3.

Sophisticated motion vector estimation might involve storage of adjacent macroblock motion vectors from
above and below the lost macroblock, for predictions both forward and backward (for B-pictures) in time.
The motion vectors from above and below (if available) could then be averaged.

Less complex decoders could use, for example, only forward prediction and/or only the motion vector
from the macroblock above the lost macroblock. This would save on storage and interpolation.

D.13.1.1.3 Use of Intra MVs

The motion compensated concealment technique outlined in D.13.1.1.2 could not ordinarily be applied
when the macroblocks above and below the lost/errored macroblock are Intra-coded, since there is no
motion vector associated with Intra-coded macroblocks. In particular, in I-pictures, this type of
concealment would not be possible with the normal calculation and use of motion vectors.

The encoding process can be extended to include motion vectors for intra macroblocks. Of course, the
motion vector and coded information for a particular macroblock must be transmitted separately (e.g. in
different packets) so that the motion vector is still available in the event that the image data is lost.

When "concealment_motion_vectors" = 1, motion vectors are transmitted with Intra macroblocks,
allowing improved concealment performance of the decoders. The concealment motion vector associated
with an Intra-coded macroblock is intended to be used only for concealment (if necessary) of the
macroblock located immediately below the Intra-coded macroblock.

For simplicity, concealment motion vectors associated with Intra-coded macroblocks are always forward, 
and are considered as frame motion-vectors in Frame pictures and field motion-vectors in field pictures. 

Therefore, encoders that choose to generate concealment motion vectors should transmit, for a given 
Intra-coded macroblock, the frame- or field-motion vector that should be used to conceal (i.e. to predict, 
with forward frame- or field-based prediction respectively) the macroblock located immediately below the 
Intra-coded macroblock. 

Concealment motion vectors are intended primarily for I- and P-pictures, but the syntax allows their use
in B-pictures. Concealment in B-pictures is not critical, since B-pictures are not used as predictors and so
errors do not propagate to other pictures. Therefore, it may be wasteful to transmit concealment motion
vectors in B-pictures.

Concealment motion vectors transmitted with Intra macroblocks located in the bottom row of a picture
cannot be used for concealment. However, if "concealment_motion_vectors" = 1, those concealment
motion vectors must be transmitted. Encoders can use the (0, 0) motion vector to minimise the coding
overhead.

When concealment motion vectors are used, it is a good idea to have one slice contain one row of
macroblocks (or fewer), so that concealment can be limited to less than one row of macroblocks when a
slice, or part of a slice, is lost. This means that the loss of macroblocks in two successive rows is much less
likely, and therefore the chances of achieving effective concealment using concealment motion vectors are
improved.

NOTE - When "concealment_motion_vectors" = 1, PMVs (Predictors for Motion Vectors) are NOT
reset when an Intra macroblock is transmitted. Ordinarily, an Intra macroblock would reset
the PMVs.
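
The temporal concealment options of D.13.1.1 can be illustrated with a minimal sketch. It assumes pictures stored as lists of rows of sample values, whole-sample motion vectors given as (dx, dy), and a single reference picture; real decoders work with half-sample vectors and field/frame prediction modes, which are omitted here. The function names are hypothetical.

```python
def estimate_mv(above, below):
    """Estimate the motion vector of a lost macroblock.

    above/below are the (dx, dy) vectors of the macroblocks above and
    below, or None if unavailable (lost, or intra without a concealment
    motion vector). With no vector at all, (0, 0) is returned, which
    reduces to substitution from the same location.
    """
    available = [mv for mv in (above, below) if mv is not None]
    if not available:
        return (0, 0)
    dx = sum(mv[0] for mv in available) / len(available)
    dy = sum(mv[1] for mv in available) / len(available)
    return (round(dx), round(dy))


def conceal_macroblock(current, reference, mb_row, mb_col, mv, size=16):
    """Overwrite one lost macroblock of `current` with the displaced
    co-sited area of `reference`, clamping at the picture edges."""
    height, width = len(reference), len(reference[0])
    top, left = mb_row * size, mb_col * size
    for y in range(size):
        for x in range(size):
            src_y = min(max(top + y + mv[1], 0), height - 1)
            src_x = min(max(left + x + mv[0], 0), width - 1)
            current[top + y][left + x] = reference[src_y][src_x]
```

A less complex decoder, as noted in D.13.1.1.2, could pass only the vector from the macroblock above and skip the averaging.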

D.13.1.2 Spatial predictive concealment

The generation of predicted, concealment macroblocks is also possible by interpolation from neighbouring
macroblocks within the one picture (Annex G [17]). This is best suited to areas of high motion, where
temporal prediction is not successful, or as an alternative means of concealment for Intra macroblocks
when concealment motion vectors (D.13.1.1.3) are not available. It also could be particularly useful for
cell loss after scene changes.

There are several possible approaches to spatial interpolation, and it could be carried out in the spatial or
DCT domain, but normally it is only feasible and useful to predict the broad features of a lost macroblock,
such as the DC coefficient and perhaps the lowest AC coefficients. Spatial prediction of fine detail (high
frequencies) is likely to be unsuccessful and is of little value in fast-moving pictures anyway.
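
One possible spatial-domain approach is sketched below, as an illustration only. It assumes that the sample row just above and the sample row just below the lost block were received, and it fills the block by linear vertical interpolation between them, which recovers only broad features. The function name is hypothetical.

```python
def interpolate_block(above_row, below_row, size):
    """Fill a lost block of `size` rows by interpolating each column
    between the boundary sample above and the boundary sample below."""
    block = []
    for y in range(size):
        weight_below = (y + 1) / (size + 1)
        block.append([round((1 - weight_below) * a + weight_below * b)
                      for a, b in zip(above_row, below_row)])
    return block
```

With boundary rows [0, 30] and [30, 0] and a block two rows high, the result is [[10, 20], [20, 10]]: a smooth gradient with no fine detail.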

Spatial predictive macroblock concealment may also be useful in combination with layered coding
methods (i.e. Data Partitioning or SNR scalability, see D.13.1.3). If in the event of cell loss some DCT
coefficients in a macroblock are recovered from the lower layer, it is possible to use all information
available (DCT coefficients recovered in the same macroblock from the lower layer and all DCT
coefficients received in the adjacent macroblocks) for error concealment. This is especially useful if the
lower layer only contains DC coefficients due to bandwidth constraints.

D.13.1.3 Layered coding to facilitate concealment

It is possible to assist the concealment process further by arranging the coded video information such that
the most important information is most likely to be received. The loss of the less important information
can then be more effectively concealed. This approach can gain from use of a transmission medium or
storage device with different priority levels (such as priority-controlled cell-based transmission in the
B-ISDN, or where different error protection or correction is provided on different channels). The
components produced by the coding process can be placed in a hierarchy of importance according to the
effect of loss on the reconstructed image. By indicating the priority of bitstream components and treating
the individual components with due importance, superior error concealment performance may be possible.

Strategies available for producing hierarchically ordered bitstreams, or layers, include:

• data partitioning - the coded macroblock data is partitioned into multiple layers such that partition zero
contains address and control information and lower order DCT coefficients, while partition one contains
high frequency DCT coefficients.

• SNR scalability - two sets of coefficients are dequantised and then added together at the receiver before
decoding. One set of coefficients could be a refinement of the quantisation error of the other, but other
combinations (including an emulation of data partitioning) are possible.

• spatial scalability - the lower layer may be coded without regard for the enhancement layer, and could
use other standard coding methods (ISO/IEC 11172-2 etc.). The enhancement layer contains the coded
prediction error from a prediction based on the lower layer.

• temporal scalability - the enhancement layer defines additional pictures which, when remultiplexed with
the base layer, provide a combined picture sequence of greater picture rate.

These strategies produce layers which, when added progressively, produce increasing quality of the
reconstructed sequence. While some of these source coding techniques may result in a bitrate increase
compared to the system without layering, the performance of the layered systems, when subjected to
channel errors, may be greater.

Considering error resilience alone, the hierarchically ordered layers should be handled with due quality,
such that some function (such as picture quality for a given total bitrate) is optimised. The bitstream
components may be treated differently at one or more of the following locations:

• encoder - different channel coding might be used.

• channel - the channel may be able to provide different cell/packet loss probabilities or error
characteristics to the different bitstream components.

• decoder - error concealment could be performed differently within each bitstream.

D.13.1.3.1 Use of data partitioning

Data partitioning allows a straightforward division of macroblock data into two layers. The PBP (Priority
Break Point) pointer determines the contents of each layer. Ordinarily, data partition 0 contains the
address and control information and the low frequency DCT coefficients, while data partition 1 contains
the high frequency DCT coefficients.

At the encoder the value of the PBP pointer may be different for each slice such that the distribution of
bits between the two layers may be controlled (e.g. maintained constant). The distribution may be different
for I, P, and B frames. The management of rate between the layers could mean that, for some
macroblocks, data partition 0 contains no DCT coefficients or motion vectors.

Good tolerance to errors can be achieved if channel errors are distributed so that data partition 1 receives
most errors.

It is assumed that errors can be detected at the decoder, so that actions can be taken to prevent errored
data from being displayed. For data partition 1, errored data is simply not displayed (i.e. only data
partition 0 is used). Losses or errors in data partition 0 should be minimised through use of high reliability
transport. Decoder concealment actions may also be necessary.
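
The division into two partitions can be illustrated with a deliberately simplified sketch. It assumes the breakpoint is a plain index into the scan-ordered coefficient list of one block, which is not how the PBP pointer is actually coded in the bitstream, and it ignores address, control and motion vector data. The function names are hypothetical.

```python
def partition(coefficients, breakpoint):
    """Split scan-ordered DCT coefficients into partition 0 (low
    frequency) and partition 1 (high frequency)."""
    return coefficients[:breakpoint], coefficients[breakpoint:]


def reconstruct(partition0, partition1, total=64):
    """Rebuild the coefficient list at the decoder. If partition 1 was
    lost or errored (None), only partition 0 is used and the remaining
    coefficients are set to zero."""
    received = list(partition0)
    if partition1 is not None:
        received += list(partition1)
    return received + [0] * (total - len(received))
```

Losing partition 1 therefore removes fine detail from the block but leaves a usable low-frequency approximation, provided partition 0 arrived intact.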

D.13.1.3.2 Use of SNR scalable coding

SNR scalable coding provides two layers with the same spatial resolution but different image quality,
depending on whether one or both layers are decoded. This technique is mainly intended to provide a
lower-quality layer that is usable even when the enhancement layer is absent. However, it also provides
good error resilience if the errors can be mainly confined to the enhancement layer.

In case of errors in the enhancement layer the lower layer can be used alone for the affected image area.
Especially in the case of frequent errors, temporary loss or permanent unavailability of the enhancement
layer this concealment is very effective, since the displayed signal can be made relatively free of
non-linear distortions like blocking or motion jerkiness.

If the enhancement layer is permanently unavailable and so only the lower layer is decoded, a small drift
may occur in the case where only one MC prediction loop is implemented in the encoder. However, this
drift is likely to be invisible in most configurations (e.g. M=3, N=12 would normally provide correction
often enough).

The lower layer of an SNR Scalable system is well suited to concealment in the case of a very high error
rate, temporary or permanent loss of the enhancement-layer signal. However, the enhancement-layer
quality in the error-free case does not achieve that of a sub-band like layered scheme (e.g. data
partitioning).
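
The principle of adding two dequantised coefficient sets can be sketched for a single coefficient. The uniform quantiser used here is an assumption chosen for brevity and is not the quantiser of this specification (no weighting matrices or dead zone); the function names are hypothetical.

```python
def encode_two_layers(coefficient, base_step, enh_step):
    """Coarsely quantise a coefficient for the lower layer, then quantise
    the remaining quantisation error for the enhancement layer."""
    base_level = round(coefficient / base_step)
    error = coefficient - base_level * base_step
    enh_level = round(error / enh_step)
    return base_level, enh_level


def decode_two_layers(base_level, enh_level, base_step, enh_step):
    """Dequantise and add the layers. If the enhancement layer is lost
    (None), the lower layer is used alone."""
    value = base_level * base_step
    if enh_level is not None:
        value += enh_level * enh_step
    return value
```

With a coefficient of 37, a lower-layer step of 16 and an enhancement step of 4, the decoder reconstructs 36 from both layers and 32 from the lower layer alone.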

D.13.1.3.3 Use of spatial scalable coding

Spatial scalable coding allows the lower layer to be coded without regard for the enhancement layer, and
other standard coding methods (ISO/IEC 11172-2 etc.) could be used. The enhancement layer contains the
coded prediction error from a prediction based on the lower layer. In case of errors in the enhancement
layer the upconverted lower layer can be used directly as concealment information for the affected image
area. Especially in case of frequent errors or temporary loss of the enhancement layer this concealment
data is relatively free of non-linear distortions like blocking (which could arise if high frequency DCT
coefficients are completely absent from the lower layer) or motion jerkiness (if the motion information is
omitted from the high priority layer).

In the error-free case the upconverted lower layer is used as an additional source of predictions in a
macroblock-adaptive way to improve the enhancement-layer coding performance. The enhancement layer
bitstream therefore consists of the quantised temporal or lower layer prediction errors.

Spatial scalable coding provides a lower layer that is very suitable for concealment in case of a high error
rate or temporary loss of the enhancement layer. However, the quality of the enhanced picture when both
layers are available will not, in general, be as good as other layered coding approaches.

D.13.1.3.4 Use of temporal scalable coding

Temporal scalability is a coding technique that allows layering of video frames. The spatial resolution of
frames in each layer is the same but the temporal rates of each layer are lower than that of the source;
however the combined temporal rate of the two layers results in full temporal rate of the source. In case of
errors in the enhancement layer, the base layer of full spatial resolution can be easily used for
concealment. Especially in case of frequent errors or temporary loss of the enhancement layer, the base
layer offers good concealment properties.

In some telecommunications applications a high degree of error resilience might be achieved with
temporal scalability by encoding the base layer using the same spatial resolution but only half the
temporal resolution of the source; the remaining frames corresponding to the other half of the temporal
resolution are coded in the enhancement layer. Typically, the enhancement layer data may be assigned
lower priority and when lost, the base layer decoded frames can be used for concealment by frame
repetition. This type of concealment leads to only a temporary loss of full temporal resolution while
maintaining full spatial quality and full spatial resolution.

In HDTV applications such as those using high temporal resolution progressive video format as source, a
high degree of error resilience can be achieved with temporal scalability. Such an application is envisaged
to require 2 layers, a base layer and an enhancement layer, each of which processes the same picture formats
(either both progressive or both interlaced) but at half the temporal rates. Temporal remultiplexing of the
base and enhancement layers irrespective of their chosen formats always results in full progressive
temporal resolution of the source. In HDTV transmission, if the lower priority enhancement layer is
corrupted, the base layer can be used for concealment, either directly, as in case of progressive format base
layer or after reversal of parity of fields for interlaced format base layer.

Typically, the enhancement layer data may be assigned lower priority and when lost, the base layer
decoded frames can be used for concealment by either frame repetition or frame averaging. This type of
concealment leads to only a temporary indistinguishable loss in temporal resolution while maintaining full
spatial quality and full spatial resolution.
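
Concealment by frame repetition or frame averaging can be sketched as follows. The sketch assumes progressive frames represented as flat lists of integer samples, a base layer holding every second frame of the source, and an enhancement layer in which a lost frame is marked as None. The function name remultiplex is hypothetical.

```python
def remultiplex(base, enhancement, average=False):
    """Interleave base and enhancement frames at full temporal rate,
    concealing lost enhancement frames from the base layer."""
    output = []
    for i, base_frame in enumerate(base):
        output.append(base_frame)
        enh_frame = enhancement[i] if i < len(enhancement) else None
        if enh_frame is None:
            if average:
                # Average the neighbouring base frames (or repeat at the end).
                following = base[i + 1] if i + 1 < len(base) else base_frame
                enh_frame = [(a + b) // 2
                             for a, b in zip(base_frame, following)]
            else:
                enh_frame = list(base_frame)   # frame repetition
        output.append(enh_frame)
    return output
```

Frame averaging needs the following base frame to be decoded before the concealed frame is displayed, so it implies slightly more buffering than frame repetition.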

D.13.2 Spatial localisation 

Spatial localisation encompasses those methods aimed at minimising the extent to which errors propagate
within a picture, by providing early resynchronisation of the elements in the bitstream that are coded
differentially between macroblocks.

Isolated bit errors may be detected through invalid codewords and so a decoder designer may choose to 
allow an errored sequence to be decoded. However, the effect on the picture is difficult to predict (legal, 
but incorrect, codewords could be generated) and it may be preferable to control the error through 
concealment of the entire affected slice(s) even when only one bit is known to be in error somewhere in a 
block of data. 

When long consecutive errors occur (e.g. packet or cell loss), virtually the only option is to discard data
until the next resynchronisation point is located (a start code at the next slice or picture header). By
providing more resynchronisation points, the area of the screen affected by a loss or error can be reduced,
in turn reducing the demands on the concealment techniques and making the errors less visible at the
expense of coding efficiency. Spatial localisation of errors is therefore dependent on controlling the slice
size since this is the smallest coded unit with resynchronisation points (start codes).

D.13.2.1 Small slices 

The most basic method for achieving spatial localisation of errors is to reduce the (fixed) number of
macroblocks in a slice. The increased frequency of resynchronisation points will reduce the affected
picture area in the event of a loss. It is effective in any transport or storage media, and in any profile since
the slice structure is always present in MPEG coded video.

The method results in a small loss of coding efficiency due to the increase of overhead information. The
loss is about 3% for 11 macroblocks per slice and 12% for 4 macroblocks per slice based on
Recommendation ITU-R BT.601 picture format at 4 Mbit/s (percentages calculated relative to a system
using 44 macroblocks, or one picture width, per slice). The efficiency loss results in degradation of picture
quality up to about 1 dB with 4 macroblocks per slice and 0,2 dB with 11 macroblocks per slice without
errors at 4 Mbit/s. However, the method performs approximately 1 to 5 dB better at CLR = 10^-2, depending
on the concealment method used (simple macroblock replacement or motion compensated concealment).

From the view point of perceived picture quality, the performance of this method is generally dependent
on the relative size of slice and picture. Therefore, the slice size should be decided by considering the
picture size (in macroblocks) and the trade-off between coding efficiency and visual degradation due to
errors.

D.13.2.2 Adaptive slice size 

There is a significant variation in the number of bits required to code a picture slice, depending on the 
coding mode, picture activity, etc. If slices contain only a few macroblocks, it will be possible that one 
transport packet, even a short packet or cell, could contain several slices. Offering multiple 
resynchronisation points in the same transport packet serves no purpose. Another problem with the 
simplistic short slice approach is that, because no account is taken of the transport packet structure, the 
first valid transport packet after a loss could contain most of the information for a slice, but it is unusable 
because the start code was lost. 

An improvement over the small slice method may be to use adaptive slice sizes. As the encoder is
producing the bitstream, it keeps track of the data contents within transport packets. The start of a slice is
placed at the first opportunity in every transport packet (or in every second, third, ...). This approach can
achieve about the same spatial localisation of errors as small, fixed size slices, but with a greater
efficiency.

However, this method ONLY gives an advantage for cell or packet based transmission, or where error
detection occurs over a large block of data. The frequent resynchronisation points of small slice
localisation are only wasteful if more than one is lost in the event of an error. If isolated bit errors affect
just one slice anyway, then there is no advantage in adapting the slice size.

Furthermore, the adaptive slice size technique requires an intimate connection between encoder and
packetiser, to allow a new slice for a new packet or cell. As such, it may not be appropriate for some
applications (e.g. stored video intended to be distributed by multiple means) because only one transport
packet structure would be assumed during encoding.
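
The placement rule can be illustrated with a minimal sketch. It assumes the coded size of each macroblock is known in bits, that transport packets have a fixed payload size, and it ignores both the bits added by the slice headers themselves and the constraint that a slice lies within one macroblock row. The function name is hypothetical.

```python
def adaptive_slice_starts(mb_bits, packet_bits):
    """Return the indices of the macroblocks at which a new slice starts:
    the first macroblock that begins inside each transport packet."""
    starts = []
    position = 0           # bit position of the current macroblock
    last_packet = -1       # last packet that already holds a slice start
    for index, bits in enumerate(mb_bits):
        packet = position // packet_bits
        if packet != last_packet:
            starts.append(index)
            last_packet = packet
        position += bits
    return starts
```

For macroblocks of 100, 300, 50, 400, 200 and 100 bits and a 376-bit payload, slices start at macroblocks 0, 2 and 4, giving one resynchronisation point per packet instead of one per macroblock.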

D.13.3 Temporal localisation

Temporal localisation encompasses those methods aimed at minimising the extent to which errors
propagate from picture to picture in the temporal sequence, by providing early resynchronisation of
pictures that are coded differentially. An obvious way to do this is to make use of intra mode coding.

D.13.3.1 Intra pictures 

By use of intra pictures a single error will not stay in the decoded picture longer than (N + M - 1) pictures
if every Nth picture is coded intra and (M - 1) B pictures are displayed before each I picture.

While the intra pictures, normally used as "anchors" for synchronising the video decoding part way
through a sequence, are useful for temporal localisation, care should be taken in adding extra intra
pictures (i.e. reducing N) for error resilience. Intra pictures require a large number of bits to code, take up
a relatively large proportion of the encoded bitstream and, as a result, are more likely to be affected by
losses or errors themselves.

D.13.3.2 Intra slices

To avoid the additional delay caused by intra pictures, some applications requiring low delay may want to
update the picture by coding only parts of the picture intra. This may provide the same kind of error
resilience as intra pictures. As an example assume that a constant number of slices per picture from top to
bottom are intra coded so that the whole picture is updated every P pictures. Three aspects of this kind of
updating should be kept in mind:

• While an errored portion of the scene will ordinarily be erased within P pictures (with an average
duration of about P/2), it is possible that motion compensation will allow the disturbance to bypass the
intra refresh and it may persist as long as 2P pictures.

• To ensure that errors are not propagating into the updated region of the picture, restrictions could
be put on motion vectors, limiting the vertical vector components to ensure that predictions are not made
from the "oldest" parts of the picture.

• The visual effect of clearing errors can be similar to a windscreen wiper clearing water. This
windscreen wiper effect can become noticeable in some cases in the error free sequence, unless the rate
control mechanism ensures that the quality of the intra slice is close to that of the surrounding non-intra
macroblocks.
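
The persistence figures quoted in D.13.3.1 and D.13.3.2 can be compared with a small calculation. This is only a restatement of the bounds given in the text; the function names are hypothetical.

```python
def intra_picture_persistence(n, m):
    """Maximum number of pictures an error can persist when every Nth
    picture is intra and (M - 1) B pictures precede each I picture."""
    return n + m - 1


def intra_slice_persistence(p):
    """Persistence figures (in pictures) for a refresh that updates the
    whole picture every P pictures."""
    return {"average": p / 2, "ordinary_maximum": p, "worst_case": 2 * p}
```

For N = 12 and M = 3 the bound for intra pictures is 14 pictures, while an intra slice refresh with P = 12 clears errors after about 6 pictures on average but may need up to 24.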

D.13.4 Summary

Table D-3 summarises the above error resilience techniques, with a guide to their applicability.



Table D-3. Summary of error concealment techniques.

| Category | Technique | Profile/Applicability |
|---|---|---|
| Concealment | Temporal predictive - substitution from previous picture | Any profile. Most suited to static pictures. |
| | Temporal predictive - Motion compensated | Any profile. Choice of sophistication in motion vector estimation. |
| | Temporal predictive - using concealment MVs | Any profile, but calculation of Intra MVs is an encoder option. |
| | Spatial predictive | Any profile. Not suitable for static, complex pictures. |
| | Data Partitioning | Not currently used in a profile, but may be added as post/pre-processing. Minimal overhead and complexity. Depending on bitrate allocation, lower layer may not provide usable pictures by itself. |
| | SNR Scalability | SNR SCALABLE, SPATIALLY SCALABLE, HIGH profiles. Suitable for very high error rates or temporary unavailability of the enhancement layer. Relatively simple to implement. |
| | Spatial Scalability | SPATIALLY SCALABLE and HIGH profiles. Suitable for very high error rates or temporary unavailability of the enhancement layer. |
| | Temporal Scalability | Not currently used in a profile. Suitable for very high error rates or temporary unavailability of the enhancement layer. |
| Spatial Localisation | Small Slices | Any profile |
| | Adaptive slice sizes | Any profile, but requires knowledge of transmission characteristics when packet size is decided. |
| Temporal Localisation | Intra pictures | Any profile, but has delay implications. |
| | Intra slices | Any profile, but errors may persist longer than for Intra picture method. |

It is not possible to provide a concise indication of error resilience performance, because assessments must
necessarily be subjective and application dependent, and so should be taken as nothing more than a guide.
It is also true that several different approaches to error resilience are likely to be used in combination.
However, the following descriptions are provided as some guidance to performance. They are the results
of cell loss experiments, looking only at cell-based transmission of video information.

A simple macroblock substitution from a previous frame combined with the small-slice method (4 
macroblocks per slice) will provide adequate picture quality for most sequences in the presence of rather 

low error rates of around CLR = 10'^ (in a reference 4 Mbit/s, Main Profile, Main Level system).

Including sophisticated motion compensated concealment (with full spatial and temporal interpolation of 
motion vectors for lost macroblocks, and concealing losses in P pictures that use intra slice updating, i.e. 

N = infinity, M = 1) provides adequate picture quality at CLR = 10'^ (again, in a reference 4 Mbit/s, Main
Profile, Main Level system). 

Operation in environments with greater loss may require use of one of the layered coding methods. With 
adequate protection of the high priority information, these schemes can provide adequate performance in 

the face of CLRs as high as 10"^ or even 10*^. Data partitioning, implemented as a post-processing 
function to a 4 Mbit/s Main Profile, Main Level system, with 50% of the rate allocated to each partition 
and no loss in the base layer, has been shown in one example to give approximately 0,5 dB loss in SNR at 

a CLR of 10'^, about 1,5 dB loss at 10~^, and with almost no visible degradation in either case. 

Given the range of different layered coding approaches that are possible, some general comments may be 
useful. In general, it is not expected that inclusion of the most complex layered coding methods could be 
justified purely on the basis of error resilience. Instead, they could be utilised for error resilience if they 
were required to satisfy other system requirements. Data partitioning is very simple to implement and is 
likely to provide error resilience very nearly the same as any of the other methods except in the case of 
extremely high error rates (>10% loss) or where the enhancement layer could be lost completely. SNR
scalability is slightly more complex, and has slightly lower efficiency than data partitioning, but it is
easier to produce lower layers of a usable quality when the enhancement layer is absent. Spatial scalability
is more complex again, but provides a good lower layer picture quality at the expense of overall (two 
layer) efficiency. 

D.14 Concatenated sequences 

Sequence concatenation occurs when an elementary stream contains a sequence ending with a 
sequence_end_code that is followed by another sequence starting with a sequence_start_code. Any 
parameter including but not limited to profile, level, VBV buffer size, frame rate, horizontal size, vertical 
size, or bitrate, which is not allowed to change within a single sequence may change from sequence to 
sequence. 

The behaviour of the decoding process and display process for concatenated sequences is not within the 
scope of this standard. An application that needs to use concatenated sequences must ensure by private 
arrangement that the decoder will be able to decode and play concatenated sequences. 

Applications should ensure that decoders will have an acceptable behaviour when parameters change, for
example changes to the following:

• Frame size

• Frame rate

• Field parity of the first displayed field of the new sequence versus the field parity of the final
displayed field of the previous sequence

• Buffer status
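
As an illustration of the kind of check an application might run at a splice point, the sketch below compares the parameters of two consecutive sequences. It is only a sketch under stated assumptions: `SequenceParams` and `changes_at_concatenation` are hypothetical names, the field parity flags are assumed to be known to the application, and buffer status is left out because it depends on the VBV state rather than on header fields alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceParams:
    """Hypothetical summary of one sequence, as seen by the application."""
    horizontal_size: int
    vertical_size: int
    frame_rate: float
    first_field_is_top: bool  # parity of the first displayed field
    last_field_is_top: bool   # parity of the final displayed field


def changes_at_concatenation(prev, new):
    """List the listed parameters that need attention at the splice."""
    changed = []
    if (prev.horizontal_size, prev.vertical_size) != (
        new.horizontal_size,
        new.vertical_size,
    ):
        changed.append("frame size")
    if prev.frame_rate != new.frame_rate:
        changed.append("frame rate")
    # Two consecutive displayed fields of the same parity break the
    # top/bottom alternation across the splice.
    if prev.last_field_is_top == new.first_field_is_top:
        changed.append("field parity")
    return changed
```

An empty result means that none of the three checked items changes across the splice; it says nothing about buffer status.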

Annex E

Profile and level restrictions

(This annex does not form an integral part of this Recommendation | International Standard)

E.1 Syntax element restrictions in profiles



This Clause tabulates all of the syntactic elements defined in this Specification. Each is classified to 
indicate whether it is required to be supported by a decoder compliant to a particular profile and level. 
Normative specifications for compliance are given in ISO/IEC 13818-4. 

NOTE - This Clause is informative and is simply intended as a summary of the normative restrictions
set out in Clause 8. If, because of an error in the preparation of this text, a discrepancy exists
between Clause 8 and Annex E, the normative text in Clause 8 shall always take precedence.

In the tables that follow a number of abbreviations are used, as shown in Table E-1.



Table E-1. Abbreviations used in the tables of Clause E

| Abbreviation | Used in | Meaning |
|---|---|---|
| X | Status | must be supported by the decoder |
| o | Status | need not be supported by the decoder |
| D | Type | item with Level-dependent parameters |
| I | Type | item independent of the Level in the Profile |
| P | Type | item for post-processing after decoding; the decoder must |

NOTE - "Status" is kept blank if an entry is not a syntactic element
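
One way to use these abbreviations in software is a small lookup keyed by syntactic element. The sketch below is illustrative only: the data structure and the function name `must_support` are hypothetical, the profile order SIMPLE, MAIN, SNR, SPATIAL, HIGH is an assumption about how the status columns are read, and the three rows are taken from the sequence scalable extension table of this annex.

```python
PROFILES = ("SIMPLE", "MAIN", "SNR", "SPATIAL", "HIGH")

# element -> (status per profile in PROFILES order, type)
# "X" = must be supported, "o" = need not be supported.
RESTRICTIONS = {
    "layer_id": ("ooXXX", "I"),
    "lower_layer_prediction_horizontal_size": ("oooXX", "D"),
    "picture_mux_enable": ("ooooo", "I"),
}


def must_support(element, profile):
    """Return True if a decoder of the given profile must support element."""
    status, _item_type = RESTRICTIONS[element]
    return status[PROFILES.index(profile)] == "X"
```

A fuller table would carry every row of Tables E-2 to E-19; the normative restrictions remain those of Clause 8.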

Table E-2. Sequence header

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | horizontal_size_value | X | X | X | X | X | D | see Table 8-7 |
| 02 | vertical_size_value | X | X | X | X | X | D | see Table 8-7 |
| 03 | aspect_ratio_information | X | X | X | X | X | [illegible] | |
| 04 | frame_rate_code | X | X | X | X | X | D | see Tables 8-7 and 8-8 |
| 05 | (pel rate) NOTE - this is not a syntactic element | | | | | | D | see Table 8-8; pel rate is a product of pels/line, lines/frame and frames/sec |
| 06 | bit_rate_value | X | X | X | X | X | D | see Table 8-9 |
| 07 | vbv_buffer_size_value | X | X | X | X | X | D | see Table 8-10 |
| 08 | constrained_parameters_flag | X | X | X | X | X | I | set to '1' if MPEG-1 constrained, set to '0' if MPEG-2 |
| 09 | load_intra_quantiser_matrix | X | X | X | X | X | I | |
| 10 | intra_quantiser_matrix[64] | X | X | X | X | X | I | |
| 11 | load_non_intra_quantiser_matrix | X | X | X | X | X | I | |
| 12 | non_intra_quantiser_matrix[64] | X | X | X | X | X | I | |
| 13 | sequence_extension() | X | X | X | X | X | I | always present if MPEG-2 |
| 14 | sequence_display_extension() | X | X | X | X | X | P | |
| 15 | sequence_scalable_extension() | o | o | X | X | X | I | see Table 8-11 for maximum number of scalable layers |
| 16 | user_data() | X | X | X | X | X | I | decoder may skip this data |

Table E-3. Sequence extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | profile_and_level_indication | X | X | X | X | X | D | profile: one of 8 values; level: one of 16 values; escape bit: one of 2 values |
| 02 | progressive_sequence | X | X | X | X | X | I | |
| 03 | chroma_format | X | X | X | X | X | I | see Table 8-5 |
| 04 | horizontal_size_extension | X | X | X | X | X | D | input picture size related |
| 05 | vertical_size_extension | X | X | X | X | X | D | input picture size related |
| 06 | bit_rate_extension | X | X | X | X | X | D | input picture size related |
| 07 | vbv_buffer_size_extension | X | X | X | X | X | D | input picture size related |
| 08 | low_delay | X | X | X | X | X | I | |
| 09 | frame_rate_extension_n | X | X | X | X | X | I | set to 0 for all defined profiles |
| 10 | frame_rate_extension_d | X | X | X | X | X | I | set to 0 for all defined profiles |



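Table E-4. Sequence display extension elements

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | video_format | X | X | X | X | X | P | |
| 02 | colour_description | X | X | X | X | X | P | input format related |
| 03 | colour_primaries | X | X | X | X | X | P | |
| 04 | transfer_characteristics | X | X | X | X | X | P | |
| 05 | matrix_coefficients | X | X | X | X | X | P | |
| 06 | display_horizontal_size | X | X | X | X | X | P | input format related |
| 07 | display_vertical_size | X | X | X | X | X | P | input format related |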

Table E-5. Sequence scalable extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | scalable_mode | o | o | X | X | X | I | SNR Profile: SNR Scalability; Spatial and High Profile: SNR or Spatial Scalability |
| 02 | layer_id | o | o | X | X | X | I | |
| | if (spatial scalable) | | | | | | | |
| 03 | lower_layer_prediction_horizontal_size | o | o | o | X | X | D | see Table 8-8 for luminance sampling density |
| 04 | lower_layer_prediction_vertical_size | o | o | o | X | X | D | see Table 8-8 for luminance sampling density |
| 05 | horizontal_subsampling_factor_m | o | o | o | X | X | I | |
| 06 | horizontal_subsampling_factor_n | o | o | o | X | X | I | |
| 07 | vertical_subsampling_factor_m | o | o | o | X | X | I | |
| 08 | vertical_subsampling_factor_n | o | o | o | X | X | I | |
| | if (temporal scalable) | | | | | | | |
| 09 | picture_mux_enable | o | o | o | o | o | I | |
| 10 | mux_to_progressive_sequence | o | o | o | o | o | I | |
| 11 | picture_mux_order | o | o | o | o | o | I | |
| 12 | picture_mux_factor | o | o | o | o | o | I | |

Table E-6. Group of pictures header

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | time_code | X | X | X | X | X | I | |
| 02 | closed_gop | X | X | X | X | X | I | |
| 03 | broken_link | X | X | X | X | X | I | |





Table E-7. Picture header

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | temporal_reference | X | X | X | X | X | I | |
| 02 | picture_coding_type | X | X | X | X | X | I | Simple Profile: I, P at Main level, I, P, B at Low level; Main, SNR, Spatial & High Profile: I, P, B |
| 03 | vbv_delay | X | X | X | X | X | | |
| 04 | full_pel_forward_vector | X | X | X | X | X | | '0' for MPEG-2 |
| 05 | forward_f_code | X | X | X | X | X | | '111' for MPEG-2 |
| 06 | full_pel_backward_vector | X | X | X | X | X | | '0' for MPEG-2 |
| 07 | backward_f_code | X | X | X | X | X | | '111' for MPEG-2 |
| 08 | extra_information_picture | X | X | X | X | X | | |
| 09 | picture_coding_extension() | X | X | X | X | X | | |
| 10 | quant_matrix_extension() | X | X | X | X | X | | |
| 11 | picture_display_extension() | X | X | X | X | X | P | |
| 12 | picture_spatial_scalable_extension() | o | o | o | X | X | | |
| 13 | picture_temporal_scalable_extension() | o | o | o | o | o | | |

Table E-8. Picture coding extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | f_code[0][0] (forward horizontal) | X | X | X | X | X | D | Main Level [1:8]; High-1440 & High Level [1:9] |
| 02 | f_code[0][1] (forward vertical) | X | X | X | X | X | D | Low Level [1:4]; Main, High-1440 & High Level [1:5] |
| 03 | f_code[1][0] (backward horizontal) | X | X | X | X | X | D | Low Level [1:7]; Main Level [1:8]; High-1440 & High Level [1:9] |
| 04 | f_code[1][1] (backward vertical) | X | X | X | X | X | D | Low Level [1:4]; Main, H-14 & High Level [1:5] |
| 05 | intra_dc_precision | X | X | X | X | X | I | Simple, Main, SNR & Spatial Profile: [8:10]; High Profile: [8:11] |
| 06 | picture_structure | X | X | X | X | X | I | |
| 07 | top_field_first | X | X | X | X | X | I | |
| 08 | frame_pred_frame_dct | X | X | X | X | X | I | |
| 09 | concealment_motion_vectors | X | X | X | X | X | I | |
| 10 | q_scale_type | X | X | X | X | X | I | |
| 11 | intra_vlc_format | X | X | X | X | X | I | |
| 12 | alternate_scan | X | X | X | X | X | I | |
| 13 | repeat_first_field | X | X | X | X | X | I | |
| 14 | chroma_420_type | X | X | X | X | X | P | |
| 15 | progressive_frame | X | X | X | X | X | P | |
| 16 | composite_display_flag | X | X | X | X | X | P | |
| 17 | v_axis | X | X | X | X | X | P | |
| 18 | field_sequence | X | X | X | X | X | P | |
| 19 | sub_carrier | X | X | X | X | X | P | |
| 20 | burst_amplitude | X | X | X | X | X | P | |
| 21 | sub_carrier_phase | X | X | X | X | X | P | |
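
The level-dependent f_code ranges in the comments column of Table E-8 lend themselves to a simple range check. The sketch below is a hedged illustration: it uses the ranges listed for the backward vectors (rows 03 and 04), keys them by the level abbreviations LL, ML, H-14 and HL, and the name `f_code_in_range` is hypothetical. It is not a conformance test.

```python
# level abbreviation -> (max horizontal f_code, max vertical f_code),
# taken from the comments of rows 03 and 04 of Table E-8.
F_CODE_MAX = {
    "LL": (7, 4),
    "ML": (8, 5),
    "H-14": (9, 5),
    "HL": (9, 5),
}


def f_code_in_range(level, horizontal, vertical):
    """Check an f_code pair against the [1:max] ranges for the level."""
    max_h, max_v = F_CODE_MAX[level]
    return 1 <= horizontal <= max_h and 1 <= vertical <= max_v
```

A decoder claiming Main level would, under this reading, accept a horizontal f_code of 8 but not 9.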

Table E-9. Quant matrix extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | load_intra_quantiser_matrix | X | X | X | X | X | I | |
| 02 | intra_quantiser_matrix[64] | X | X | X | X | X | I | |
| 03 | load_non_intra_quantiser_matrix | X | X | X | X | X | I | |
| 04 | non_intra_quantiser_matrix[64] | X | X | X | X | X | I | |
| 05 | load_chroma_intra_quantiser_matrix | o | o | o | o | X | I | |
| 06 | chroma_intra_quantiser_matrix[64] | o | o | o | o | X | I | |
| 07 | load_chroma_non_intra_quantiser_matrix | o | o | o | o | X | I | |
| 08 | chroma_non_intra_quantiser_matrix[64] | o | o | o | o | X | I | |





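Table E-10. Picture display extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | frame_centre_horizontal_offset | X | X | X | X | X | P | input format related |
| 02 | frame_centre_vertical_offset | X | X | X | X | X | P | input format related |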

Table E-11. Picture temporal scalable extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | reference_select_code | o | o | o | o | o | I | |
| 02 | forward_temporal_reference | o | o | o | o | o | I | |
| 03 | backward_temporal_reference | o | o | o | o | o | I | |





Table E-12. Picture spatial scalable extension

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | lower_layer_temporal_reference | o | o | o | X | X | I | |
| 02 | lower_layer_horizontal_offset | o | o | o | X | X | D | input format related |
| 03 | lower_layer_vertical_offset | o | o | o | X | X | D | input format related |
| 04 | spatial_temporal_weight_code_table_index | o | o | o | X | X | I | |
| 05 | lower_layer_progressive_frame | o | o | o | X | X | I | |
| 06 | lower_layer_deinterlaced_field_select | o | o | o | X | X | I | |





Recommendation ITU-T H.262 (1995 E)

ISO/IEC 13818-2: 1995 (E)



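Table E-13. Slice layer

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | slice_vertical_position_extension | X | X | X | X | X | D | input format related |
| 02 | priority_breakpoint | o | o | o | o | o | I | only required for data partitioning |
| 03 | quantiser_scale_code | X | X | X | X | X | I | |
| 04 | intra_slice | X | X | X | X | X | I | |
| 05 | extra_information_slice | X | X | X | X | X | I | decoder may skip this data |
| 06 | macroblock() | X | X | X | X | X | I | |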





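Table E-14. Macroblock layer

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | macroblock_escape | X | X | X | X | X | | |
| 02 | macroblock_address_increment | X | X | X | X | X | | |
| 03 | macroblock_modes() | X | X | X | X | X | | |
| 04 | quantiser_scale_code | X | X | X | X | X | | |
| 05 | motion_vectors(0) | X | X | X | X | X | | forward motion vector |
| 06 | motion_vectors(1) | o | X | X | X | X | | backward motion vector |
| 07 | coded_block_pattern() | X | X | X | X | X | | |
| 08 | block(i) | X | X | X | X | X | | |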

Table E-15. Macroblock modes

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | macroblock_type | X | X | X | X | X | I | |
| 02 | spatial_temporal_weight_code | o | o | o | X | X | I | |
| 03 | frame_motion_type | X | X | X | X | X | I | 01: Field-based prediction; 10: Frame-based prediction; 11: Dual-prime |
| 04 | field_motion_type | X | X | X | X | X | I | 01: Field-based prediction; 10: 16x8 MC; 11: Dual-prime |
| 05 | dct_type | X | X | X | X | X | I | |
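
The two-bit motion type codes listed in the comments of Table E-15 can be decoded with a pair of lookup tables. This is a small sketch, not decoder code: the function name `describe_motion_type` and the `frame_picture` flag are hypothetical, and it assumes frame_motion_type applies to frame pictures and field_motion_type to field pictures.

```python
FRAME_MOTION_TYPE = {
    0b01: "Field-based prediction",
    0b10: "Frame-based prediction",
    0b11: "Dual-prime",
}

FIELD_MOTION_TYPE = {
    0b01: "Field-based prediction",
    0b10: "16x8 MC",
    0b11: "Dual-prime",
}


def describe_motion_type(code, frame_picture):
    """Map a 2-bit motion type code to the name given in Table E-15."""
    table = FRAME_MOTION_TYPE if frame_picture else FIELD_MOTION_TYPE
    return table.get(code, "not listed in Table E-15")
```

Note that the same code '10' names a different prediction mode depending on the picture structure.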





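Table E-16. Motion vectors

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | motion_vertical_field_select | X | X | X | X | X | I | |
| 02 | motion_vector() | X | X | X | X | X | I | |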

Table E-17. Motion vector

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | motion_horizontal_code | X | X | X | X | X | | |
| 02 | motion_horizontal_r | X | X | X | X | X | | |
| 03 | dmv_horizontal | X | X | X | X | X | | |
| 04 | motion_vertical_code | X | X | X | X | X | | |
| 05 | motion_vertical_r | X | X | X | X | X | | |
| 06 | dmv_vertical | X | X | X | X | X | | |







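Table E-18. Coded Block Pattern

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | coded_block_pattern_420 | X | X | X | X | X | I | |
| 02 | coded_block_pattern_1 | o | o | o | o | X | I | 4:2:2 |
| 03 | coded_block_pattern_2 | o | o | o | o | o | I | 4:4:4 |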

Table E-19. Block layer

| # | Syntactic elements | Status: SIMPLE | Status: MAIN | Status: SNR | Status: SPATIAL | Status: HIGH | Type | Comments |
|---|---|---|---|---|---|---|---|---|
| 01 | DCT coefficients | X | X | X | X | X | I | |
| 02 | End of block | X | X | X | X | X | I | |





E.2 Permissible layer combinations

The following tables illustrate the parameter limits that may be applied in each layer of a bitstream, and
the corresponding appropriate profile_and_level_indication that should be used. Each table describes the
limits of a single compliance point in the profile / level matrix.

The following notation has been adopted:

<profile abbreviation>@<level abbreviation>

The abbreviations are defined in Table E-20.



Table E-20. Abbreviations for profile and level names

| Profile | <profile abbreviation> | Level | <level abbreviation> |
|---|---|---|---|
| Simple | SP | Low | LL |
| Main | MP | Main | ML |
| SNR Scalable | SNR | High-1440 | H-14 |
| Spatially Scalable | Spt | High | HL |
| High | HP | | |
| ISO/IEC 11172-1 constrained parameters | ISO 11172 | | |
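
The `<profile abbreviation>@<level abbreviation>` notation is easy to split mechanically. The following sketch, with the hypothetical function name `parse_compliance_point`, maps an indication such as MP@ML back to the names of Table E-20; the special entry "ISO 11172" has no "@" and is deliberately not handled here.

```python
PROFILE_NAMES = {
    "SP": "Simple",
    "MP": "Main",
    "SNR": "SNR Scalable",
    "Spt": "Spatially Scalable",
    "HP": "High",
}

LEVEL_NAMES = {
    "LL": "Low",
    "ML": "Main",
    "H-14": "High-1440",
    "HL": "High",
}


def parse_compliance_point(text):
    """Split 'MP@ML' style notation into (profile name, level name)."""
    profile_abbr, sep, level_abbr = text.partition("@")
    if not sep:
        raise ValueError(f"missing '@' in compliance point: {text!r}")
    return PROFILE_NAMES[profile_abbr], LEVEL_NAMES[level_abbr]
```

Unknown abbreviations raise `KeyError`, which keeps typing errors in a table from passing silently.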



Table E-21. Simple profile @ Main level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |

Table E-22. Main profile @ Low level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |

Table E-23. Main profile @ Main level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |

Table E-24. Main profile @ High-1440 level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | MP@H-14 |

Table E-25. Main profile @ High level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 1920/1152/60 | 62 668 800 | 80 | 9 781 248 | MP@HL |
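
The single-layer tables above can be read as a set of upper bounds that a bitstream has to respect together. The sketch below encodes the MP@ML row of Table E-23 and checks a candidate format against it. It is an illustration under assumptions: `LayerLimits` and `within_limits` are hypothetical names, the bit rate column is taken to be in units of 1 000 000 bit/s, and all limits are assumed to apply simultaneously. Note that the maximum sample rate is lower than the product of the three density maxima, so it is checked separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerLimits:
    """Upper bounds for one layer, transcribed from one table row."""
    max_h: int            # samples per line
    max_v: int            # lines per frame
    max_f: float          # frames per second
    max_sample_rate: int  # samples per second
    max_bit_rate: int     # bits per second
    max_vbv_bits: int     # VBV buffer size in bits


# Table E-23, Main profile @ Main level.
MP_AT_ML = LayerLimits(720, 576, 30, 10_368_000, 15_000_000, 1_835_008)


def within_limits(limits, h, v, f, bit_rate, vbv_bits):
    """Return True if every parameter respects the given limits."""
    return (
        h <= limits.max_h
        and v <= limits.max_v
        and f <= limits.max_f
        and h * v * f <= limits.max_sample_rate
        and bit_rate <= limits.max_bit_rate
        and vbv_bits <= limits.max_vbv_bits
    )
```

For example, 720 x 576 at 25 frames/s gives exactly 10 368 000 samples/s and passes, whereas 720 x 576 at 30 frames/s stays within each density maximum but exceeds the sample rate limit.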

Table E-26. SNR profile @ Low level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 2 | 0 | Base | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | SNR | 352/288/30 | 2 534 400 | 4 | 475 136 | SNR@LL |
| 2 | 0 | Base | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML |
| | 1 | SNR | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL |
| 2 | 0 | Base | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL |
| | 1 | SNR | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL |



Table E-27. SNR profile @ Main level

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 2 | 0 | Base | 720/576/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | SNR | 720/576/30 | 2 534 400 | 15 | 1 835 008 | SNR@ML |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML |
| 2 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | SNR | 352/288/30 | 3 041 280 | 15 | 1 835 008 | SNR@ML |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML |

Table E-28. Spatial profile @ High-1440 level (Base Layer + SNR)

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 2 | 0 | Base | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | SNR | 352/288/30 | 2 534 400 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | SNR | 352/288/30 | 3 041 280 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 1440/1152/60 | 47 001 600 | 40 | 4 882 432 | MP@H-14 |
| | 1 | SNR | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |


Table E-29. Spatial profile @ High-1440 level (Base Layer + Spatial)

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 2 | 0 | Base | 768/576/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 2 | 0 | Base | 1440/1152/60 | 47 001 600 | 40 | 4 882 432 | MP@H-14 |
| | 1 | Spatial | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |

Table E-30. Spatial profile @ High-1440 level (Base Layer + SNR + Spatial)

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 3 | 0 | Base | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | SNR | 352/288/30 | 2 534 400 | 4 | 475 136 | SNR@LL |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML |
| | 1 | SNR | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL |
| | 1 | SNR | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | SNR | 720/576/30 | 2 534 400 | 15 | 1 835 008 | SNR@ML |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | SNR | 352/288/30 | 3 041 280 | 15 | 1 835 008 | SNR@ML |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML |
| | 1 | SNR | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML |
| | 2 | Spatial | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 1440/1152/60 | 10 368 000 | 15 | 1 835 008 | MP@H-14 |
| | 1 | SNR | 1440/1152/60 | 10 368 000 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | Spatial | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |

Table E-31. Spatial profile @ High-1440 level (Base Layer + Spatial + SNR)

| No. of layers | layer id | Scalable mode | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|
| 3 | 0 | Base | 768/576/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172 |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | SNR | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | SNR | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | SNR | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |
| | 1 | Spatial | 1440/1152/30 | 47 001 600 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | SNR | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |
| 3 | 0 | Base | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@H-14 |
| | 1 | Spatial | 1440/1152/60 | 47 001 600 | 40 | 4 882 432 | Spt@H-14 |
| | 2 | SNR | 1440/1152/30 | 47 001 600 | 60 | 7 340 032 | Spt@H-14 |



Table E-32. High profile @ Main level (Base Layer)

| No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML |
| 1 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML |

Table E-33. High profile @ Main level (Base Layer + SNR)

| No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1000000 | Maximum total VBV buffer | Profile and level indication |
|---|---|---|---|---|---|---|---|---|
| 2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |
| | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML |
| | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL |
| | 1 | SNR | 4:2:2 | 352/288/30 | 3 041 280 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |
| | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML |
| | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML |
| | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML |
| 2 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML |
| | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML |



Table E-34. High profile @ Main level (Base Layer + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SP@ML
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SP@ML
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML

Table E-35. High profile @ Main level (Base Layer + SNR + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML

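
The bit-rate and VBV columns are headed "total", and within each layer combination the figures rise with the layer id. A minimal sketch of that reading follows, assuming "total" means the bound for a layer together with all layers below it. The function name `totals_are_cumulative` and the tuple layout are illustrative and do not come from the specification.

```python
from typing import List, Tuple

# One layer: (layer id, scalable mode, max total bit rate in Mbit/s, max total VBV buffer in bits)
Layer = Tuple[int, str, float, int]


def totals_are_cumulative(layers: List[Layer]) -> bool:
    """True if layer ids run 0..n-1 and neither total decreases from one layer to the next."""
    ids_ok = [layer[0] for layer in layers] == list(range(len(layers)))
    rates_ok = all(a[2] <= b[2] for a, b in zip(layers, layers[1:]))
    vbv_ok = all(a[3] <= b[3] for a, b in zip(layers, layers[1:]))
    return ids_ok and rates_ok and vbv_ok


# First combination of the Base Layer + SNR + Spatial table at Main level (4:2:0 in every layer).
combo = [
    (0, "Base", 3, 360_448),
    (1, "SNR", 4, 475_136),
    (2, "Spatial", 20, 2_441_216),
]

print(totals_are_cumulative(combo))
```

A combination that fails this check is more likely a misread digit than a property of the profile, so it is worth comparing against a clean copy before relying on it.
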
Table E-36. High profile @ Main level (Base Layer + Spatial + SNR)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SP@ML
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SP@ML
  | 1 | Spatial | 4:2:0 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SP@ML
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 2 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML



Table E-37. High profile @ High-1440 level [Base Layer]

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
1 | 0 | Base | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
1 | 0 | Base | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14

Table E-38. High profile @ High-1440 level (Base Layer + SNR)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | SNR | 4:2:2 | 352/288/30 | 3 041 280 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | MP@H-14
  | 1 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | MP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 14 745 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 1 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14

Table E-39. High profile @ High-1440 level (Base Layer + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | MP@H-14
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | MP@H-14
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
2 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14

Table E-40. High profile @ High-1440 level (Base Layer + SNR + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14

Table E-41. High profile @ High-1440 level (Base Layer + Spatial + SNR)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14



Table E-41. High profile @ High-1440 level (Base Layer + Spatial + SNR) (concluded)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | Spt@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 62 668 800 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
3 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | HP@H-14
  | 2 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14



Table E-42. High profile @ High level [Base Layer]

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
1 | 0 | Base | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
1 | 0 | Base | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL

Table E-43. High profile @ High level (Base Layer + SNR)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | SNR | 4:2:2 | 352/288/30 | 3 041 280 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | MP@H-14
  | 1 | SNR | 4:2:0 | 1440/1152/60 | 47 001 600 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 47 001 600 | 60 | 7 340 032 | MP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1920/1152/60 | 62 668 800 | 80 | 9 781 248 | MP@HL
  | 1 | SNR | 4:2:0 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1920/1152/60 | 62 668 800 | 80 | 9 781 248 | MP@HL
  | 1 | SNR | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 14 745 600 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
  | 1 | SNR | 4:2:0 | 1440/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1440/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:2 | 1440/1152/60 | 47 001 600 | 80 | 9 781 248 | HP@H-14
  | 1 | SNR | 4:2:2 | 1440/1152/60 | 47 001 600 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1920/1152/60 | 83 558 400 | 80 | 9 781 248 | HP@HL
  | 1 | SNR | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 1920/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@HL
  | 1 | SNR | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:2 | 1920/1152/60 | 62 668 800 | 80 | 9 781 248 | HP@HL
  | 1 | SNR | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL

Table E-44. High profile @ High level (Base Layer + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 352/288/30 | 2 534 400 | 1.856 | 327 680 | ISO 11172
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | MP@LL
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | MP@H-14
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | MP@H-14
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | HP@H-14
  | 1 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | HP@H-14
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
2 | 0 | Base | 4:2:2 | 960/576/30 | 14 745 600 | 25 | 3 047 424 | HP@H-14
  | 1 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL

Table E-45. High profile @ High level (Base Layer + SNR + Spatial)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | SP@ML
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | SP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 352/288/30 | 3 041 280 | 3 | 360 448 | MP@LL
  | 1 | SNR | 4:2:0 | 352/288/30 | 3 041 280 | 4 | 475 136 | SNR@LL
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 10 | 1 212 416 | MP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | SNR@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 10 368 000 | 15 | 1 835 008 | MP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 10 368 000 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 20 | 2 441 216 | MP@H-14
  | 1 | SNR | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | Spt@H-14
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 20 | 2 441 216 | MP@H-14
  | 1 | SNR | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | Spt@H-14
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 14 745 600 | 20 | 2 441 216 | MP@H-14
  | 1 | SNR | 4:2:2 | 960/576/30 | 14 745 600 | 25 | 3 047 424 | HP@H-14
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL



Table E-45. High profile @ High level (Base Layer + SNR + Spatial) (concluded)

No. of layers | layer id | Scalable mode | Chroma Format | Maximum sample density (H/V/F) | Maximum sample rate | Maximum total bit rate /1 000 000 | Maximum total VBV buffer | Profile and level indication
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 14 745 600 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:0 | 720/576/30 | 14 745 600 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:2 | 720/576/30 | 11 059 200 | 15 | 1 835 008 | HP@ML
  | 1 | SNR | 4:2:2 | 720/576/30 | 11 059 200 | 20 | 2 441 216 | HP@ML
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 20 | 2 441 216 | HP@H-14
  | 1 | SNR | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | HP@H-14
  | 2 | Spatial | 4:2:0 | 1920/1152/60 | 83 558 400 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 19 660 800 | 20 | 2 441 216 | HP@H-14
  | 1 | SNR | 4:2:0 | 960/576/30 | 19 660 800 | 25 | 3 047 424 | HP@H-14
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:0 | 960/576/30 | 14 745 600 | 20 | 2 441 216 | HP@H-14
  | 1 | SNR | 4:2:2 | 960/576/30 | 14 745 600 | 25 | 3 047 424 | HP@H-14
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL
3 | 0 | Base | 4:2:2 | 960/576/30 | 14 745 600 | 20 | 2 441 216 | HP@H-14
  | 1 | SNR | 4:2:2 | 960/576/30 | 14 745 600 | 25 | 3 047 424 | HP@H-14
  | 2 | Spatial | 4:2:2 | 1920/1152/60 | 62 668 800 | 100 | 12 222 464 | HP@HL

Table E-46. High profile @ High level (Base Layer + Spatial + SNR)

No. of  Layer  Scalable  Chroma  Maximum sample   Maximum      Maximum total    Maximum total  Profile and level
layers  id     mode      format  density (H/V/F)  sample rate  bit rate         VBV buffer     indication
                                                               /1 000 000

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:0   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   720/576/30       14 745 600   20               2 441 216      HP@ML
        2      SNR       4:2:0   720/576/30       14 745 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   720/576/30       14 745 600   20               2 441 216      HP@ML
        2      SNR       4:2:2   720/576/30       14 745 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:2   720/576/30       11 059 200   20               2 441 216      HP@ML
        2      SNR       4:2:2   720/576/30       11 059 200   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:0   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:2   1440/1152/60     47 001 600   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1920/1152/60     83 558 400   80               9 781 248      HP@HL
        2      SNR       4:2:0   1920/1152/60     83 558 400   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:0   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       2 534 400    1.856            327 680        ISO 11172
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:0   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL
Table E-46. High profile @ High level (Base Layer + Spatial + SNR) (continued)

No. of  Layer  Scalable  Chroma  Maximum sample   Maximum      Maximum total    Maximum total  Profile and level
layers  id     mode      format  density (H/V/F)  sample rate  bit rate         VBV buffer     indication
                                                               /1 000 000

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:0   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:2   1440/1152/60     47 001 600   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1920/1152/60     83 558 400   80               9 781 248      HP@HL
        2      SNR       4:2:0   1920/1152/60     83 558 400   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:0   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      SP@ML
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:0   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   720/576/30       14 745 600   20               2 441 216      HP@ML
        2      SNR       4:2:0   720/576/30       14 745 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   720/576/30       14 745 600   20               2 441 216      HP@ML
        2      SNR       4:2:2   720/576/30       14 745 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:2   720/576/30       11 059 200   20               2 441 216      HP@ML
        2      SNR       4:2:2   720/576/30       11 059 200   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:0   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     62 668 800   100              12 222 464     HP@HL
Table E-46. High profile @ High level (Base Layer + Spatial + SNR) (continued)

No. of  Layer  Scalable  Chroma  Maximum sample   Maximum      Maximum total    Maximum total  Profile and level
layers  id     mode      format  density (H/V/F)  sample rate  bit rate         VBV buffer     indication
                                                               /1 000 000

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:2   1440/1152/60     47 001 600   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1920/1152/60     83 558 400   80               9 781 248      HP@HL
        2      SNR       4:2:0   1920/1152/60     83 558 400   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:0   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   352/288/30       3 041 280    4                475 136        MP@LL
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:0   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1440/1152/60     47 001 600   60               7 340 032      Spt@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:0   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1440/1152/60     62 668 800   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:2   1440/1152/60     47 001 600   80               9 781 248      HP@H-14
        2      SNR       4:2:2   1440/1152/60     47 001 600   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1920/1152/60     83 558 400   80               9 781 248      HP@HL
        2      SNR       4:2:0   1920/1152/60     83 558 400   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:0   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   720/576/30       10 368 000   15               1 835 008      MP@ML
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   960/576/30       19 660 800   25               3 047 424      HP@H-14
        1      Spatial   4:2:0   1920/1152/60     83 558 400   80               9 781 248      HP@HL
        2      SNR       4:2:0   1920/1152/60     83 558 400   100              12 222 464     HP@HL




Table E-46. High profile @ High level (Base Layer + Spatial + SNR) (concluded)

No. of  Layer  Scalable  Chroma  Maximum sample   Maximum      Maximum total    Maximum total  Profile and level
layers  id     mode      format  density (H/V/F)  sample rate  bit rate         VBV buffer     indication
                                                               /1 000 000

3       0      Base      4:2:0   960/576/30       19 660 800   25               3 047 424      HP@H-14
        1      Spatial   4:2:0   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:0   960/576/30       19 660 800   25               3 047 424      HP@H-14
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL

3       0      Base      4:2:2   960/576/30       14 745 600   25               3 047 424      HP@H-14
        1      Spatial   4:2:2   1920/1152/60     62 668 800   80               9 781 248      HP@HL
        2      SNR       4:2:2   1920/1152/60     62 668 800   100              12 222 464     HP@HL
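
As an illustration of how one row group of these tables might be applied, the following is a minimal Python sketch that encodes a single three-layer combination from Table E-46 (base layer HP@H-14, spatial and SNR enhancement layers HP@HL) and tests candidate layer parameters against its upper bounds. The names `LayerLimit`, `COMBINATION` and `fits`, and the candidate values, are hypothetical and chosen only for this example. The sketch assumes that the bit-rate column is in units of 1 000 000 bit/s, that the VBV buffer column is in bits, and that the sample rate can be approximated as width × height × frame rate; none of these assumptions is stated by the tables themselves.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LayerLimit:
    """One table row: the upper bounds that apply to a single layer."""
    layer_id: int
    mode: str
    chroma: str
    max_density: Tuple[int, int, int]  # (H, V, F) as printed in the table
    max_sample_rate: int
    max_total_bit_rate: float          # table value, assumed bit/s / 1 000 000
    max_total_vbv: int                 # assumed bits
    indication: str


# One combination copied from Table E-46 (concluded).
COMBINATION = [
    LayerLimit(0, "Base", "4:2:0", (960, 576, 30), 19_660_800, 25, 3_047_424, "HP@H-14"),
    LayerLimit(1, "Spatial", "4:2:2", (1920, 1152, 60), 62_668_800, 80, 9_781_248, "HP@HL"),
    LayerLimit(2, "SNR", "4:2:2", (1920, 1152, 60), 62_668_800, 100, 12_222_464, "HP@HL"),
]


def fits(limit, width, height, frame_rate, total_bit_rate, total_vbv):
    """Return True if a layer stays within every upper bound of its row.

    The sample rate is approximated as width * height * frame_rate,
    which is an assumption of this sketch.
    """
    h, v, f = limit.max_density
    return (
        width <= h
        and height <= v
        and frame_rate <= f
        and width * height * frame_rate <= limit.max_sample_rate
        and total_bit_rate <= limit.max_total_bit_rate * 1_000_000
        and total_vbv <= limit.max_total_vbv
    )


# Hypothetical candidate: (width, height, frame rate, total bit rate, total VBV).
candidate = [
    (720, 576, 25, 20_000_000, 2_441_216),
    (1440, 1152, 25, 60_000_000, 7_340_032),
    (1440, 1152, 25, 90_000_000, 12_000_000),
]
all_layers_fit = all(fits(lim, *c) for lim, c in zip(COMBINATION, candidate))
```

Under the approximation used here, a layer at the full 1920 × 1152 density and 60 frames/s would exceed the 62 668 800 sample-rate bound, so the density and sample-rate columns have to be checked independently rather than inferred from one another.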
Annex F 
Patent statements 

(This annex does not form an integral part of this Recommendation | International Standard) 

The user's attention is called to the possibility that, for some of the processes specified in this part of 
ISO/IEC 13818, conformance with this specification may require use of an invention covered by patent 
rights. 

By publication of this part of ISO/IEC 13818, no position is taken with respect to the validity of this claim 
or of any patent rights in connection therewith. However, each company listed in this Annex has 
undertaken to file with the Information Technology Task Force (ITTF) a statement of willingness to grant 
a license under such rights that they hold on reasonable and non-discriminatory terms and conditions to 
applicants desiring to obtain such a license. 

Information regarding such patents can be obtained from the following organisations. 

The table summarises the formal patent statements received and indicates the parts of the standard to 
which the statement applies. The list includes all organisations that have submitted informal patent 
statements. However, if no "X" is present, no formal patent statement has yet been received from that 
organisation. 
Company                                       V  A  S
AT&T                                          X  X  X
BBC Research Department                       X     X
Bellcore                                      X
Belgian Science Policy Office                 X
BOSCH                                         X  X  X
British Telecommunications
CCETT
Columbia University in the City of New York   X
CSELT                                         X
David Sarnoff Research Center                 X  X  X
Deutsche Thomson-Brandt GmbH                  X  X  X
France Telecom CNET                           X
Fraunhofer Gesellschaft                          X  X
Fujitsu                                       X  X  X
GC Technology Corporation                     X  X  X
General Instruments                           X
Goldstar                                      X  X  X
Hitachi, Ltd.                                 X
International Business Machines Corporation   X  X  X
IRT                                              X
KDD                                           X
Massachusetts Institute of Technology         X  X  X
Matsushita Electric Industrial Co., Ltd.      X  X  X
Mitsubishi Electric Corporation
National Transcommunications Limited          X
NEC Corporation                               X  X
Nippon Hoso Kyokai                            X
Nippon Telegraph and Telephone                X
Nokia Research Center                         X
Norwegian Telecom Research                    X
Philips Consumer Electronics                  X  X  X
OKI                                           X
Qualcomm Incorporated                         X
Royal PTT Nederland N.V., PTT Research (NL)   X  X  X
Samsung Electronics                           X  X  X
Scientific Atlanta                            X  X  X
Siemens AG                                    X
Sharp Corporation                             X  X  X
Sony Corporation
Texas Instruments
Thomson Consumer Electronics                  X
Toshiba Corporation                           X
TV/Com                                        X  X  X
Victor Company of Japan Limited               X  X  X